Commit 96b55c0c authored by Robin Quillivic

add qmd data description

parent 3d7b36f1
---
title: "Dataset Description for emotional and psychological response paper"
subtitle: 'Data from "Etude 1000", programme 13-Novembre'
author: "Robin Quillivic, PhD student"
jupyter: emo
date: "2023-06-01"
execute:
  echo: false
format:
  pdf: default
  html:
    self-contained: true
    grid:
      margin-width: 100px
---
# Introduction
This document describes precisely how the raw data were transformed into the final dataset used to analyse the emotional and psychological responses of the participants in Etude 1000.
# Data
## Interview related features
| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| code_enqueteur | str | unique id associated with the interviewer | |
| interview_date | date | date of the p1 interview | |
| participation_p2 | int | participation in p2 | |
| interview_location | str | where the interview took place | |
{{< pagebreak >}}
## Sociodemographic features
| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| age | int | age of the participant | ex: 22 |
| age_norm | float | age of the participant normalized by the maximum age | ex: 0.43 |
| birthdate | date | birthdate of the participant | |
| sexe | str | M or F for male or female | |
| sexe_enc | int | 1 for M | |
| code_insee | str | profession category associated with the participant | |
| code_insee_fr | str | profession category associated with the participant, in French | |
| code_insee_enc | str | ordinal encoding of code_insee | |
| education_level | str | number of years after or before the Bac | ex: Bac+2 |
| education_level_enc | int | number of years after or before the Bac | ex: 2 |
| education_degree | str | degree name in the Anglo-Saxon system | ex: Bachelor |
| marital_status | str | marital status of the participant | ex: Single |
| single | int | single or not (1 or 0) | ex: 1 |
| child_number | int | number of children | ex: 1 |
| residence_mode | str | mode of residence | ex: In FLATSHARE |
| living_alone | int | living alone or not | ex: 1 |
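
The derived columns in the table above (`age_norm`, `sexe_enc`, `single`) could be computed roughly as follows. This is a minimal sketch on made-up rows, not the project's actual pipeline: the `toy` frame is hypothetical, and the rules for `sexe_enc` (0 for F) and `single` (1 when `marital_status` is "Single") are assumptions, since the table only documents "1 for M" and "single or not".

```python
import pandas as pd

# Toy rows; the values are made up for illustration only.
toy = pd.DataFrame({
    "age": [22, 35, 51],
    "sexe": ["M", "F", "F"],
    "marital_status": ["Single", "Married", "Single"],
})

# age_norm: age divided by the largest age in the sample.
toy["age_norm"] = toy["age"] / toy["age"].max()

# sexe_enc: 1 for M (documented); 0 for F is an assumption.
toy["sexe_enc"] = (toy["sexe"] == "M").astype(int)

# single: 1 when marital_status is "Single", else 0 (assumed rule).
toy["single"] = (toy["marital_status"] == "Single").astype(int)
```

With this max-based scaling, the oldest participant always gets `age_norm` equal to 1.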
### Distribution of some features
```{python}
import warnings

import pandas as pd
from IPython.display import Markdown, display

warnings.filterwarnings('ignore')

# Load the socio-demographic and emotional dataset (semicolon-separated CSV).
data = pd.read_csv('/home/robin/Data/Etude_1000/20230601_socio_and_emotional_data.csv', sep=';', encoding='utf-8')

# Show the value counts of each categorical sociodemographic feature.
for col in ['code_insee', "education_degree", "marital_status", "residence_mode"]:
    display(Markdown(f"### {col}"))
    display(data[col].value_counts())
```
{{< pagebreak >}}
## Exposure related features
| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| exp_cercle | int | circle of exposure | |
| exp_critereA | str | based on DSM-V and the category of testimony | ex: A4 |
| exp_exposition | int | exposed or not | ex: 1 |
| exp_testimony_category | str | raw Etude 1000 data P1_0_9_1 | |
| history_personal | int | presence or absence of a personal history of trauma or terrorist attack | ex: 1 |
| history_family | int | presence or absence of a family history of trauma or terrorist attack | ex: 0 |
```{python}
# Show the value counts of the exposure features.
for col in ["exp_critereA", "exp_cercle"]:
    display(Markdown(f"### {col}"))
    display(data[col].value_counts())
```
{{< pagebreak >}}
## Media and communication related features
| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| media_last_month | str | response to III_5_1 | ex: more than once a week |
| media_last_month_enc | int | ordinal encoding of media_last_month | ex: 1 |
| media_nb_hour_13_14 | int | number of hours of media consumption during the night of the 13th to the 14th | ex: 10 |
| media_cat_hour_13_14 | str | categorical encoding of media_nb_hour_13_14 | ex: between 5h and 10h |
| communication_last_month | str | number of communications related to the event in the last month | ex: "more than once a week" |
| communication_last_month_enc | int | ordinal encoding of communication_last_month | ex: 1 |
| communication_13_14 | int | number of communications during the night of the 13th to the 14th | ex: 15 |
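
The ordinal and categorical encodings listed above can be illustrated with pandas. This is a sketch under stated assumptions: the `toy` frame, the ordered `levels` list and the bin edges are hypothetical. Only the labels "more than once a week" and "between 5h and 10h" appear in the table; the real answer scale and cut points may differ.

```python
import pandas as pd

# Toy rows; the values are made up for illustration only.
toy = pd.DataFrame({
    "media_last_month": ["never", "more than once a week", "once a week"],
    "media_nb_hour_13_14": [2, 10, 7],
})

# Hypothetical ordered answer scale (lowest to highest frequency).
levels = ["never", "once a week", "more than once a week"]
ordered = pd.Categorical(toy["media_last_month"], categories=levels, ordered=True)

# Ordinal encoding: position of each answer in the ordered scale.
toy["media_last_month_enc"] = ordered.codes

# Categorical encoding of the number of hours, with hypothetical bin edges.
toy["media_cat_hour_13_14"] = pd.cut(
    toy["media_nb_hour_13_14"],
    bins=[0, 5, 10, 24],
    labels=["less than 5h", "between 5h and 10h", "more than 10h"],
    include_lowest=True,
)
```

Using an ordered `Categorical` keeps the mapping between labels and integer codes explicit, which makes the `_enc` columns easier to audit.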
### Some distributions related to media and communication
```{python}
# Show the value counts of media consumption over the last month.
for col in ["media_last_month"]:
    display(Markdown(f"### {col}"))
    display(data[col].value_counts())
```
{{< pagebreak >}}
## Memory accuracy related features
| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| estimation_nb_death | int | participant's estimate of the number of deaths | ex: 150 |
| estimation_nb_death_correct | int | whether the estimate is correct | ex: 1 |
| estimation_nb_death_correct_cat | str | whether the estimate is an overestimate or an underestimate | ex: "strong_over_estimation" |
| estimation_nb_attackers | int | participant's estimate of the number of attackers | ex: 3 |
| estimation_nb_attackers_correct | int | whether the estimate is correct | ex: 1 |
| memory_before_event | str | accuracy of memory before the event | ex: "I remember_precisly" |
| memory_before_event_enc | int | ordinal encoding of memory_before_event | ex: 1 |
| memory_after_event | str | accuracy of memory after the event | ex: "I remember_precisly" |
| memory_after_event_enc | int | ordinal encoding of memory_after_event | ex: 1 |
| memory_event | str | accuracy of memory of the event | ex: "I remember_precisly" |
| memory_event_enc | int | ordinal encoding of memory_event | ex: 1 |
| memory_hour_event | str | accuracy of memory of the hour of the event | ex: "I remember_precisly" |
| memory_hour_event_enc | int | ordinal encoding of memory_hour_event | ex: 1 |
| memory_crash_tgv_after_event | str | accuracy of memory of the TGV crash 2 days after the event | ex: "I remember_precisly" |
| memory_crash_tgv_after_event_enc | int | ordinal encoding of memory_crash_tgv_after_event | ex: 1 |
| memory_crash_plane_before_event | str | accuracy of memory of the plane crash in March 2015 | ex: "I remember_precisly" |
| memory_crash_plane_before_event_enc | int | ordinal encoding of memory_crash_plane_before_event | ex: 1 |
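
One possible way to derive a correctness flag and an over/under-estimation category from a numeric estimate is sketched below. The function `categorize_estimate`, its thresholds (`tolerance`, `strong`) and every category name except "strong_over_estimation" are hypothetical; the rule actually used to build `estimation_nb_death_correct` and `estimation_nb_death_correct_cat` is not documented here.

```python
def categorize_estimate(estimate, reference, tolerance=0.1, strong=0.5):
    """Return (correct_flag, category) for a numeric estimate.

    Hypothetical rule: the estimate is correct when its relative error is
    within `tolerance`, and "strong" when it exceeds `strong`.
    """
    relative_error = (estimate - reference) / reference
    if abs(relative_error) <= tolerance:
        return 1, "correct"
    direction = "over" if relative_error > 0 else "under"
    strength = "strong_" if abs(relative_error) > strong else ""
    return 0, f"{strength}{direction}_estimation"


# Usage with a made-up reference value of 100.
close_guess = categorize_estimate(105, reference=100)
high_guess = categorize_estimate(200, reference=100)
low_guess = categorize_estimate(70, reference=100)
```

Working with the relative error rather than the absolute error lets the same thresholds apply to both the death and the attacker estimates, which are on very different scales.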
### Some distributions related to memory
```{python}
# Show the value counts of two memory accuracy features.
for col in ["estimation_nb_death_correct_cat", "memory_hour_event"]:
    display(Markdown(f"### {col}"))
    display(data[col].value_counts())
```
{{< pagebreak >}}
## Psychological features
These features were computed using psychiatrist expertise and 14 responses from the emotional questionnaires.

| Feature name | Type | Description | Remarks |
|---------------|------|-------------|----------|
| diagnosis_confidence_score | float | confidence of the diagnosis | ex: 0.8 |
| PTSD_probable | int | whether the participant has probable PTSD | ex: 1 |
| partial_PTSD_probable | int | whether the participant has probable partial PTSD | ex: 1 |
| full_or_partial_PTSD_probable | int | whether the participant has probable full or partial PTSD | ex: 1 |
| CB_probable | int | whether the participant is probable CB (intrusion symptoms) | ex: 1 |
| CC_probable | int | whether the participant is probable CC (avoidance symptoms) | ex: 1 |
| CD_probable | int | whether the participant is probable CD | ex: 1 |
| CD_probable_depression | int | whether the participant is probable CD with depression | ex: 1 |
| CD_probable_dissociation | int | whether the participant is probable CD with dissociation | ex: 1 |
| CE_probable | int | whether the participant is probable CE (hyperarousal symptoms) | ex: 1 |
| CG_probable | int | whether the participant is probable CG | ex: 1 |
{{< pagebreak >}}
## Emotional features
List of emotions selected by the participants in response to this question:

> II_5_1_P1_II_5_1_QUELLES_SONT_LES_EMOTIONS_QUE_VOUS_RESSENTEZ_QUAND_VOUS_PENSEZ_AUX_EVENEMENTS_DU_13_NOVEMBRE_AU_COURS_DU_MOIS_QUI_VIENT_DE_S_ECOULER
>
> II_5_1_P1_II_5_1_WHAT_ARE_THE_EMOTIONS_YOU_FEEL_WHEN_YOU_THINK_ABOUT_THE_EVENTS_OF_13_NOVEMBER_IN_THE_MONTH_THAT_HAS_JUST_PASSED
### Simple Emotion distribution
```{python}
for col in ['Surprise',
            'Joy',
            'Anger',
            'Stunned',
            'Sadness',
            'Satisfaction',
            'Emphathy',
            'Interest',
            'Fear',
            'Incomprehension',
            'Disgust']:
    display(Markdown(f"### {col}"))
    # Column names in the dataset are lowercase; show relative frequencies.
    display(data[col.lower()].value_counts(normalize=True))
```
All combinations of two emotions are also computed, for example *surprise_joy*, *surprise_anger*, *surprise_stunned*, *surprise_sadness*, etc.
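
The pairwise columns could be generated with `itertools.combinations`, as in the sketch below. It assumes that each emotion column is a 0/1 indicator and that a pair equals 1 only when both emotions were selected; neither assumption is confirmed by this document. The `toy` frame and its three columns are made up for illustration.

```python
from itertools import combinations

import pandas as pd

# Toy 0/1 indicators; the values are made up for illustration only.
toy = pd.DataFrame({
    "surprise": [1, 0, 1],
    "joy": [0, 0, 1],
    "anger": [1, 1, 1],
})

emotions = ["surprise", "joy", "anger"]
pairs = {}
for first, second in combinations(emotions, 2):
    # Assumed rule: the pair is 1 only when both emotions were selected.
    pairs[f"{first}_{second}"] = toy[first] & toy[second]

# Add all pair columns at once, keeping the original columns first.
toy = toy.assign(**pairs)
```

Because `combinations` yields each unordered pair once, following the order of the `emotions` list, the column names match the *surprise_joy* pattern and no mirrored duplicates such as *joy_surprise* are created.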