"data/col_test_data/toy_data_2000.csv").head(2) file_loader(
|  | original_logtime | desc_text | food_type | PID |
|---|---|---|---|---|
0 | 2021-05-12 02:30:00 +0000 | milk | b | yrt1999 |
1 | 2021-05-12 02:45:00 +0000 | some medication | m | yrt1999 |
These functions primarily serve as parts of other functions, but are provided here for utility.
file_loader (data_source:Union[str,pandas.core.frame.DataFrame])
Flexible file loader that reads a single file path or a folder path. Accepts the .csv and .json file formats.
|  | Type | Details |
|---|---|---|
data_source | str \| pd.DataFrame | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
Returns | pd.DataFrame | A single dataframe consisting of all data matching the provided file or folder path. |
Providing the file loader with a specific file path outputs a single Pandas dataframe generated from that data source.
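For example, reading the toy dataset shown at the top of this page:
file_loader('data/col_test_data/toy_data_2000.csv').head(2)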
|  | original_logtime | desc_text | food_type | PID |
|---|---|---|---|---|
0 | 2021-05-12 02:30:00 +0000 | milk | b | yrt1999 |
1 | 2021-05-12 02:45:00 +0000 | some medication | m | yrt1999 |
The file loader can also accept string patterns to read in multiple files at once. Providing a patterned path such as yrt*_food_data*.csv would load all data matching this pattern.
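A sketch of such a call (the folder prefix here is illustrative):
file_loader('data/col_test_data/yrt*_food_data*.csv').head(2)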
|  | original_logtime | desc_text | food_type | PID |
|---|---|---|---|---|
0 | 2021-05-12 02:30:00 +0000 | Milk | b | yrt1999 |
1 | 2021-05-12 02:45:00 +0000 | Some Medication | m | yrt1999 |
It can also handle reading mixed file types. The below dataframe consists of data read from all .json and .csv files in the data/output/ folder.
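For example, reading that folder directly:
file_loader('data/output/').head(2)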
|  | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | date | local_time | time | week_from_start | year | cleaned | day_count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7572733.0 | alqt14018795225 | 150.0 | Water | w | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 | 17:30:00 | 1.0 | 2017.0 | NaN | NaN |
1 | 411111.0 | alqt14018795225 | 150.0 | Coffee White | b | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 | 00:01:00 | 1.0 | 2017.0 | NaN | NaN |
find_date (data_source:Union[str,pandas.core.frame.DataFrame], h:int=4, date_col:int=5)
Extracts date from a datetime column after shifting datetime by ‘h’ hours. A day starts ‘h’ hours earlier if ‘h’ is negative, or ‘h’ hours later if ‘h’ is positive.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
h | int | 4 | Number of hours to shift the definition for ‘date’ by. h = 4 would shift days so that time membership to each date starts at 4:00 AM and ends at 3:59:59 AM the next calendar day. |
date_col | int | 5 | Column number for existing datetime column in provided data source. Data exported from mCC typically has datetime as its 5th column (with indexing starting from 0). |
Returns | pd.Series | Series of dates in ISO 8601 format. |
By default, find_date expects log dates for studies to begin at 4:00 AM. To use regular calendar dates, remember to set h = 0.
import pandas as pd
df = file_loader('data/test_food_details.csv')
df['original_logtime'] = pd.to_datetime(df['original_logtime'])
df['date'] = find_date(df, h = 0)
df[['original_logtime', 'date']].head(3)
|  | original_logtime | date |
|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 2017-12-08 |
1 | 2017-12-09 00:01:00+00:00 | 2017-12-09 |
2 | 2017-12-09 00:58:00+00:00 | 2017-12-09 |
In this example, with log dates starting at the default value of 4 (4:00 AM), we see that two logs from very early morning on 2017-12-09 are counted as being logged on 2017-12-08 instead.
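The corresponding call, mirroring the example above:
df['date'] = find_date(df, h = 4)
df[['original_logtime', 'date']].head(3)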
|  | original_logtime | date |
|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 2017-12-08 |
1 | 2017-12-09 00:01:00+00:00 | 2017-12-08 |
2 | 2017-12-09 00:58:00+00:00 | 2017-12-08 |
Similarly, in an example where we start log days four hours earlier, the last two rows have dates that are shifted so their log date is one day later than their exact calendar datetime.
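The corresponding call:
df['date'] = find_date(df, h = -4)
df[['original_logtime', 'date']].head(5)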
|  | original_logtime | date |
|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 2017-12-08 |
1 | 2017-12-09 00:01:00+00:00 | 2017-12-09 |
2 | 2017-12-09 00:58:00+00:00 | 2017-12-09 |
3 | 2018-02-22 21:52:00+00:00 | 2018-02-23 |
4 | 2018-02-22 22:53:00+00:00 | 2018-02-23 |
find_float_time (data_source:Union[str,pandas.core.frame.DataFrame], h:int=4, date_col:int=5)
Extracts time from a datetime column after shifting datetime by ‘h’ hours. A day starts ‘h’ hours earlier if ‘h’ is negative, or ‘h’ hours later if ‘h’ is positive.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
h | int | 4 | Number of hours to shift the definition for ‘time’ by. h = 4 would allow float representations of time between 4 (inclusive) and 28 (exclusive), representing time that goes from 4:00 AM to 3:59:59 AM the next calendar day. NOTE: h value for this function should match the h value used for generating dates. |
date_col | int | 5 | Column number for existing datetime column in provided data source. Data exported from mCC typically has datetime as its 5th column (with indexing starting from 0). |
Returns | pd.Series | Series of times in float format (e.g. 4:36 AM -> 4.6). |
By default, find_float_time expects studies to begin at 4:00 AM. To preserve regular calendar dates use h = 0.
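For example, with unshifted times:
df['float_time'] = find_float_time(df, h = 0)
df[['original_logtime', 'float_time']].head(3)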
|  | original_logtime | float_time |
|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 17.500000 |
1 | 2017-12-09 00:01:00+00:00 | 0.016667 |
2 | 2017-12-09 00:58:00+00:00 | 0.966667 |
Using positive values of h for both the date and float time functions changes a row’s date ownership based on its original logtime. Float time should be shifted by the same h value as date membership so that times falling on a different calendar date can still be differentiated when necessary (e.g. with h = 4, a 2:00 AM log is represented as 26.0 rather than 2.0, marking it as belonging to the previous calendar day’s logging date while preserving its clock time).
df['float_time'] = find_float_time(df, h = 4)
df['date'] = find_date(df, h = 4)
df[['original_logtime','date', 'float_time']].head(3)
|  | original_logtime | date | float_time |
|---|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 |
1 | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 |
2 | 2017-12-09 00:58:00+00:00 | 2017-12-08 | 24.966667 |
In rare cases, it may be valuable to shift date and time by negative values. In this example, where a log date starts at 8:00 PM the previous calendar day and ends at 8:00 PM the current calendar day, note that the last two rows have negative float times and their log dates are one day later than their original calendar dates.
df['float_time'] = find_float_time(df, h = -4)
df['date'] = find_date(df, h = -4)
df[['original_logtime','date', 'float_time']].head(5)
|  | original_logtime | date | float_time |
|---|---|---|---|
0 | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 |
1 | 2017-12-09 00:01:00+00:00 | 2017-12-09 | 0.016667 |
2 | 2017-12-09 00:58:00+00:00 | 2017-12-09 | 0.966667 |
3 | 2018-02-22 21:52:00+00:00 | 2018-02-23 | -2.133333 |
4 | 2018-02-22 22:53:00+00:00 | 2018-02-23 | -1.116667 |
week_from_start (data_source:Union[str,pandas.core.frame.DataFrame], identifier:int=1)
Calculates the number of weeks between each logging entry and the first logging entry for each participant. A ‘date’ column must exist in the provided data source. Using the provided find_date function is recommended.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
Returns | np.array | Array of weeks passed from log date to the minimum date for each participant. |
Using find_date to ensure that a date column exists in the data source is recommended. A column labeled ‘date’ is a requirement of this function.
df['date'] = find_date(df)
df['week_from_start'] = week_from_start(df)
df[['unique_code','original_logtime','week_from_start']][2:4]
|  | unique_code | original_logtime | week_from_start |
|---|---|---|---|
2 | alqt14018795225 | 2017-12-09 00:58:00+00:00 | 1 |
3 | alqt14018795225 | 2018-02-22 21:52:00+00:00 | 11 |
find_phase_duration (df:pandas.core.frame.DataFrame)
Calculates the duration (in days) of the study phase for each row.
|  | Type | Details |
|---|---|---|
df | pd.DataFrame | Participant information dataframe with columns for start and ending date for that row’s study phase. The expected column numbers for starting and ending dates are outlined in the HOWTO document that accompanies TREETS. |
Returns | pd.DataFrame | Dataframe with an additional column describing study phase duration. |
import pandas as pd
find_phase_duration(pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx'))[['phase_duration']]
|  | phase_duration |
|---|---|
0 | 3 days |
1 | 4 days |
2 | 3 days |
3 | 4 days |
4 | NaT |
load_food_data (data_source:Union[str,pandas.core.frame.DataFrame], h:int, identifier:int=1, datetime_col:int=5)
Loads and processes existing logging data, adding specific datetime information in formats more suitable for TREETS functions.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
h | int |  | Number of hours to shift the definition of ‘date’ by. h = 4 would indicate that a log date begins at 4:00 AM and ends the following calendar day at 3:59:59 AM. Float representations of time would therefore go from 4.0 (inclusive) to 28.0 (exclusive) to represent ‘date’ membership for days shifted from their original calendar date. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
datetime_col | int | 5 | Column number for an existing datetime column in provided data source. Data exported from mCC typically has datetime as its 5th column (with indexing starting from 0). |
Returns | pd.DataFrame | Dataframe with additional date, float time, and week from start columns. |
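A usage sketch that would produce output like the table below:
df = load_food_data('data/test_food_details.csv', h = 4)
df.head(2)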
|  | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | date | float_time | time | week_from_start | year |
|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 | 17:30:00 | 1 | 2017 |
1 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 | 00:01:00 | 1 | 2017 |
in_good_logging_day (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=5, identifier:int=1, date_col:int=6, time_col:int=7)
Determines whether each log falls within a ‘good logging day’. A log day is considered ‘good’ if it contains at least the minimum number of required logs, with at least the specified minimum hour separation between the first and last log for that log date. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 5 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | np.array | Boolean array describing whether each log falls on a ‘good’ logging day. |
df = load_food_data('data/test_food_details.csv', h = 4)
df['in_good_logging_day'] = in_good_logging_day(df)
df.head(2)
|  | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | date | float_time | time | week_from_start | year | in_good_logging_day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 | 17:30:00 | 1 | 2017 | True |
1 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 | 00:01:00 | 1 | 2017 | True |
FoodParser ()
The food parser takes unprocessed food log entries and adds relevant information from a pre-made dictionary. This includes matching unprocessed terms to their likely matches and adding food type and other identifying information.
clean_loggings (data_source:Union[str,pandas.core.frame.DataFrame], identifier:int=1)
Cleans and attempts typo correction for all logging text entries.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
Returns | pd.DataFrame | Dataframe with an additional column containing cleaned and typo corrected item entries. |
Text descriptions of food items are cleaned using a built-in dictionary of common typos and corrections for each phrase. Phrases are then matched using a dictionary of known n-gram item names. The resulting item(s) are provided as a list.
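A sketch of a call that would produce output like the table below (the column selection is illustrative):
df = load_food_data('data/test_food_details.csv', h = 4)
clean_loggings(df)[['unique_code', 'desc_text', 'cleaned']].head(3)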
|  | unique_code | desc_text | cleaned |
|---|---|---|---|
0 | alqt14018795225 | Water | [water] |
1 | alqt14018795225 | Coffee White | [coffee, white] |
2 | alqt14018795225 | Salad | [salad] |
get_types (data_source:Union[str,pandas.core.frame.DataFrame], food_type:Union[str,list])
Filters logs for only logs of specified type(s).
|  | Type | Details |
|---|---|---|
data_source | str \| pd.DataFrame | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. A column ‘food_type’ is required to be within the data. |
food_type | str \| list | A single food type, or list of food types. Valid types are ‘f’: food, ‘b’: beverage, ‘w’: water, and ‘m’: medication. |
Returns | pd.DataFrame | Dataframe filtered for only logs of specific type(s). |
Type selection accepts multiple types at once as a list of entry types. All types chosen must be valid.
Available food types include:
‘f’: Food
‘b’: Beverage
‘w’: Water
‘m’: Medication
Flavored water beverages such as La Croix are counted as ‘water’ and not as ‘beverage’.
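For example, filtering for water and food logs together (mirroring the single-type call below):
df = load_food_data('data/test_food_details.csv', h = 4)
get_types(df, ['w', 'f'])[['unique_code', 'desc_text', 'food_type']].head(3)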
|  | unique_code | desc_text | food_type |
|---|---|---|---|
0 | alqt14018795225 | Water | w |
2 | alqt14018795225 | Salad | f |
3 | alqt78896444285 | Water | w |
Filtering for a single type is also possible.
df = load_food_data('data/test_food_details.csv', h = 4)
get_types(df, 'm')[['unique_code','desc_text','food_type']].head(3)
|  | unique_code | desc_text | food_type |
|---|---|---|---|
323 | alqt14018795225 | Caffeine | m |
361 | alqt14018795225 | Caffeine | m |
420 | alqt14018795225 | Caffeine | m |
count_caloric_entries (df:pandas.core.frame.DataFrame)
Counts the number of food (‘f’) and beverage (‘b’) loggings.
|  | Type | Details |
|---|---|---|
df | pd.DataFrame | Dataframe of food logging data. |
Returns | int | Number of caloric (food or beverage) entries found. |
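A minimal usage sketch:
df = load_food_data('data/test_food_details.csv', h = 4)
count_caloric_entries(df)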
mean_daily_eating_duration (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates mean daily eating window by taking the average of each day’s eating window. An eating window is defined as the duration of time between first and last caloric (food or beverage) intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of average daily eating window duration. |
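A sketch of the call that produces this value (mirroring the examples later in this section):
df = load_food_data('data/test_food_details.csv', h = 4)
mean_daily_eating_duration(df, 'date', 'float_time')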
14.038679245283017
std_daily_eating_duration (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the standard deviation of the daily eating window. An eating window is defined as the duration of time between first and last caloric (food or beverage) intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the standard deviation of daily eating window duration. |
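Likewise, a call of this form:
df = load_food_data('data/test_food_details.csv', h = 4)
std_daily_eating_duration(df, 'date', 'float_time')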
7.018679942775867
earliest_entry (df:pandas.core.frame.DataFrame, time_col:int=7)
Calculates the earliest recorded caloric (food or beverage) entry. It is recommended that you use find_float_time to generate the necessary time column for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the earliest logtime on any date. |
mean_first_cal (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the average time of first caloric intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of average first caloric entry time. |
# find the mean first caloric intake time for each participant
df.groupby(['unique_code']).agg(mean_first_cal, date_col = 6, time_col = 7).iloc[:,0]
unique_code
alqt1148284857 7.315278
alqt14018795225 7.635938
alqt16675467779 6.153904
alqt21525720972 13.211957
alqt45631586569 15.056295
alqt5833085442 12.551515
alqt62359040167 7.252137
alqt6695047873 7.573077
alqt78896444285 6.347510
alqt8668165687 9.702555
Name: ID, dtype: float64
std_first_cal (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the standard deviation for time of first caloric intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the standard deviation of first caloric entry time. |
mean_last_cal (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the average time of last caloric intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of average last caloric entry time. |
std_last_cal (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the standard deviation for time of last caloric intake. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the standard deviation of last caloric entry time. |
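A sketch of the call producing this value:
df = load_food_data('data/test_food_details.csv', h = 4)
std_last_cal(df, 'date', 'float_time')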
4.359435007580498
mean_daily_eating_occasions (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the average number of daily eating occasions. An eating occasion is a single caloric (food or beverage) log. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Average number of daily eating occasions. |
df = load_food_data('data/test_food_details.csv', h = 4)
mean_daily_eating_occasions(df, 'date', 'float_time')
6.8915094339622645
std_daily_eating_occasions (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the standard deviation of the number of daily eating occasions. An eating occasion is a single caloric (food or beverage) log. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Standard deviation of the number of daily eating occasions. |
df = load_food_data('data/test_food_details.csv', h = 4)
std_daily_eating_occasions(df, 'date', 'float_time')
4.44839423402741
mean_daily_eating_midpoint (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the average daily midpoint eating occasion time. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the average daily midpoint eating occasion time. |
df = load_food_data('data/test_food_details.csv', h = 4)
mean_daily_eating_midpoint(df, 'date', 'float_time')
16.536425576519914
std_daily_eating_midpoint (df:pandas.core.frame.DataFrame, date_col:int=6, time_col:int=7)
Calculates the standard deviation of the daily midpoint eating occasion time. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column for ‘food_type’ must exist within the data. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | float | Float representation of the standard deviation of the daily midpoint eating occasion time. |
df = load_food_data('data/test_food_details.csv', h = 4)
std_daily_eating_midpoint(df, 'date', 'float_time')
4.107072970435106
logging_day_counts (df:pandas.core.frame.DataFrame)
Calculates the number of days that contain any logs. It is recommended that you use find_date to generate the necessary date column for this function.
|  | Type | Details |
|---|---|---|
df | pd.DataFrame | Dataframe of food logging data. A column for ‘date’ must exist within the data. |
Returns | int | Number of days with at least one log on that day. |
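A minimal usage sketch:
df = load_food_data('data/test_food_details.csv', h = 4)
logging_day_counts(df)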
find_missing_logging_days (df:pandas.core.frame.DataFrame, start_date:datetime.date='not_defined', end_date:datetime.date='not_defined')
Finds days that have no log entries between a start (inclusive) and end date (inclusive). It is recommended that you use find_date to generate the necessary date column for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. |
start_date | datetime.date | not_defined | Starting date for missing day evaluation. By default the earliest date in the data will be used. |
end_date | datetime.date | not_defined | Ending date for missing day evaluation. By default the latest date in the data will be used. |
Returns | list | List of days within the given timeframe that have no log entries. |
The string ‘not_defined’ is the intended default value for the start and end dates, signifying that the earliest and/or latest date within the data should be used. If a participant is missing a valid start or end date, null is returned.
import datetime
df = load_food_data('data/test_food_details.csv', h = 4)
find_missing_logging_days(df, datetime.date(2017, 12, 7), datetime.date(2017, 12, 10))
[datetime.date(2017, 12, 7),
datetime.date(2017, 12, 9),
datetime.date(2017, 12, 10)]
|  | ID | unique_code | research_info_id | desc_text | food_type | original_logtime | date | float_time | time | week_from_start | year |
|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7572733 | alqt14018795225 | 150 | Water | w | 2017-12-08 17:30:00+00:00 | 2017-12-08 | 17.500000 | 17:30:00 | 1 | 2017 |
1 | 411111 | alqt14018795225 | 150 | Coffee White | b | 2017-12-09 00:01:00+00:00 | 2017-12-08 | 24.016667 | 00:01:00 | 1 | 2017 |
2 | 8409118 | alqt14018795225 | 150 | Salad | f | 2017-12-09 00:58:00+00:00 | 2017-12-08 | 24.966667 | 00:58:00 | 1 | 2017 |
good_lwa_day_counts (df:pandas.core.frame.DataFrame, window_start:datetime.time, window_end:datetime.time, min_log_num:int=2, min_separation:int=5, buffer_time:str='15 minutes', h:int=4, start_date:datetime.date='not_defined', end_date:datetime.date='not_defined', time_col:int=7)
Calculates the number of ‘good’ logging days, ‘good’ window days, ‘outside’ window days and adherent days.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. |
window_start | datetime.time |  | Starting time for a time restriction window. |
window_end | datetime.time |  | Ending time for a time restriction window. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 5 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
buffer_time | str | 15 minutes | pd.Timedelta parsable string, representing ‘wiggle room’ for adherence. |
h | int | 4 | Number of hours to shift the definition of ‘date’ by. h = 4 would indicate that a log date begins at 4:00 AM and ends the following calendar day at 3:59:59 AM. Float representations of time would therefore go from 4.0 (inclusive) to 28.0 (exclusive) to represent ‘date’ membership for days shifted from their original calendar date. |
start_date | datetime.date | not_defined | Starting date for missing day evaluation. By default the earliest date in the data will be used. |
end_date | datetime.date | not_defined | Ending date for missing day evaluation. By default the latest date in the data will be used. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | tuple[list, list] | A list containing the number of ‘good’ logging days, ‘good’ window days, ‘outside’ window days, and adherent days, and a list of three lists containing the dates that are not considered ‘good’ logging days, ‘good’ window days, or adherent days (in that order). |
The main use of this function is to calculate window and logging adherence. These are represented as ‘good’ (valid) logging days, ‘good’ window days, ‘outside’ (invalid) window days, and adherent days.
The definition of each is:
‘Good’ Logging Day: a day with at least min_log_num logs, where the first and last log of the day are at least min_separation hours apart.
‘Good’ Window Day: a day whose logs fall within the assigned time restriction window (window_start to window_end), allowing buffer_time of ‘wiggle room’.
Adherent Day: a day that is both a ‘good’ logging day and a ‘good’ window day.
The second product of this function is three lists that outline which days are not compliant with one of the definitions above. The first list (index 0) consists of dates that are not ‘good’ logging days, and the second contains dates that are not ‘good’ window days. The final list consists of dates that are not adherent (neither ‘good’ window nor ‘good’ logging dates).
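A minimal usage sketch (the 8:00 AM to 6:00 PM window is illustrative, not taken from the source data):
import datetime
df = load_food_data('data/test_food_details.csv', h = 4)
counts, non_compliant_dates = good_lwa_day_counts(df, window_start = datetime.time(8, 0), window_end = datetime.time(18, 0))
# counts: number of 'good' logging, 'good' window, 'outside' window, and adherent days
# non_compliant_dates: the three lists of dates described above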
This group of functions provides methods for filtering participant data.
filtering_usable_data (df:pandas.core.frame.DataFrame, num_items:int, num_days:int, identifier:int=1, date_col:int=6)
Filters data for only participants whose data satisfies the minimum number of days and logs. It is recommended that you use find_date to generate the necessary date column for this function.
|  | Type | Default | Details |
|---|---|---|---|
df | pd.DataFrame |  | Dataframe of food logging data. A column ‘desc_text’, typically found in mCC data, is required. |
num_items | int |  | Minimum number of logs required to pass filter criteria. |
num_days | int |  | Minimum number of unique logging days required to pass filter criteria. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
Returns | tuple[pd.DataFrame, set] | Data filtered to only include data from participants that have passed filtering criteria. Set of participants that passed filtering criteria. |
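A minimal usage sketch (the thresholds are illustrative):
df = load_food_data('data/test_food_details.csv', h = 4)
filtered_df, usable_participants = filtering_usable_data(df, num_items = 20, num_days = 7)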
prepare_baseline_and_intervention_usable_data (data_source:Union[str,pandas.core.frame.DataFrame], baseline_num_items:int, baseline_num_days:int, intervention_num_items:int, intervention_num_days:int, identifier:int=1, date_col:int=6)
Filters for ‘usable’ data within the baseline period and the last two weeks of intervention (weeks 13 and 14). It is recommended that you use the function ‘week_from_start’ to generate the necessary week column for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
baseline_num_items | int |  | Number of logs for a participant’s baseline data to pass filter criteria. |
baseline_num_days | int |  | Number of unique logging days for a participant’s baseline data to pass filter criteria. |
intervention_num_items | int |  | Number of logs for a participant’s intervention data to pass filter criteria. |
intervention_num_days | int |  | Number of unique logging days for a participant’s intervention data to pass filter criteria. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
Returns | list | List of two dataframes: usable baseline data, usable intervention data. |
Data analysis and summary functions, including summary functions for specific statistics.
users_sorted_by_logging (data_source:Union[str,pandas.core.frame.DataFrame], food_type:list=['f', 'b', 'm', 'w'], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Reports the number of ‘good’ logging days for each user, in descending order based on number of ‘good’ logging days.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
food_type | list | [‘f’, ‘b’, ‘m’, ‘w’] | A single food type, or list of food types. Valid types are ‘f’: food, ‘b’: beverage, ‘w’: water, and ‘m’: medication. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | pd.DataFrame | Dataframe containing the number of good logging days for each user. |
eating_intervals_percentile (data_source:Union[str,pandas.core.frame.DataFrame], identifier:int=1, time_col:int=7)
Calculates the 2.5, 5, 10, 12.5, 25, 50, 75, 87.5, 90, 95, and 97.5 percentile eating time for each participant. It also calculates the middle 95, 90, 80, 75, and 50 percentile eating windows for each participant. It is recommended that you use find_float_time to generate the necessary time column for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | pd.DataFrame | Dataframe with count, mean, std, min, quantiles and mid XX%tile eating window durations for all participants. |
first_cal_analysis_summary (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Calculates the 5, 10, 25, 50, 75, 90, 95 percentile of first caloric entry time for each participant on ‘good’ logging days. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | pd.DataFrame | Dataframe with 5, 10, 25, 50, 75, 90, 95 percentile of first caloric entry time for all participants. |
last_cal_analysis_summary (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Calculates the 5, 10, 25, 50, 75, 90, 95 percentile of last caloric entry time for each participant on ‘good’ logging days. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | pd.DataFrame | Dataframe with 5, 10, 25, 50, 75, 90, 95 percentile of last caloric entry time for all participants. |
summarize_data (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Summarizes participant data, including number of days, total number of logs, number of food/beverage logs, number of medication logs, number of water logs, eating window duration information, first and last caloric log information, and adherence.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column (with indexing starting from 0). |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | pd.DataFrame | Summary dataframe. |
This function provides summary data for an entire study, without separating for study phases. Summaries include statistics for first and last caloric log, eating window, and relevant calculations for middle 95 percentile eating window.
summarize_data_with_experiment_phases (food_data:pandas.core.frame.DataFrame, ref_tbl:pandas.core.frame.DataFrame, min_log_num:int=2, min_separation:int=5, buffer_time:str='15 minutes', h:int=4, report_level:int=2, txt:bool=False)
Summarizes participant data for each experiment phase and eating window assignment. Summary includes number of days, total number of logs, number of food/beverage logs, number of medication logs, number of water logs, eating window duration information, first and last caloric log information, and adherence.
|  | Type | Default | Details |
|---|---|---|---|
food_data | pd.DataFrame |  | Dataframe of food logging data. A column for “original_logtime” must exist within the data. mCC output style data is expected. |
ref_tbl | pd.DataFrame |  | Participant data reference table. See the accompanying HOWTO document for required column positions and formatting. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 5 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
buffer_time | str | 15 minutes | pd.Timedelta parsable string, representing ‘wiggle room’ for adherence. |
h | int | 4 | Number of hours to shift the definition of ‘date’ by. h = 4 would indicate that a log date begins at 4:00 AM and ends the following calendar day at 3:59:59 AM. Float representations of time would therefore go from 4.0 (inclusive) to 28.0 (exclusive) to represent ‘date’ membership for days shifted from their original calendar date. |
report_level | int | 2 | Additional printed info detail level. 0 = No Report. 1 = Report ‘No Logging Days’. 2 = Report ‘No Logging Days’, ‘Bad Logging Days’, ‘Bad Window Days’, and ‘Non-Adherent Days’. |
txt | bool | False | If True, a text format (.txt) report will be saved in the current directory, with the name ‘treets_warning_dates.txt’ |
Returns | pd.DataFrame | Summary dataframe, where each row represents the summary for a participant during a particular study phase. Participants can have multiple rows for a single study phase if, during that study phase, their assigned eating window is altered. |
Plotting functions.
first_cal_mean_with_error_bar (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Represents mean and standard deviation of first caloric intake time for each participant as a scatter plot, with participants as the x-axis and time as the y-axis. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
last_cal_mean_with_error_bar (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Represents mean and standard deviation of last caloric intake time for each participant as a scatter plot, with participants as the x-axis and time as the y-axis. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
first_cal_analysis_variability_plot (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Calculates first caloric log time variability for ‘good’ logging days by subtracting 5, 10, 25, 50, 75, 90, 95 percentile of first caloric intake time from the 50th percentile first caloric intake time. It also produces a histogram that represents the 90%-10% interval for all participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
last_cal_analysis_variability_plot (data_source:Union[str,pandas.core.frame.DataFrame], min_log_num:int=2, min_separation:int=4, identifier:int=1, date_col:int=6, time_col:int=7)
Calculates last caloric log time variability for ‘good’ logging days by subtracting 5, 10, 25, 50, 75, 90, 95 percentile of last caloric intake time from the 50th percentile last caloric intake time. It also produces a histogram that represents the 90%-10% interval for all participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
min_log_num | int | 2 | Minimum number of logs required for a day to be considered a ‘good’ logging day. |
min_separation | int | 4 | Minimum number of hours between first and last log on a log day for it to be considered a ‘good’ logging day. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
first_cal_avg_histplot (data_source:Union[str,pandas.core.frame.DataFrame], identifier:int=1, date_col:int=6, time_col:int=7)
Plots a histogram of average first caloric intake for all participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
first_cal_sample_distplot (data_source:Union[str,pandas.core.frame.DataFrame], n:int, replace:bool=False, identifier:int=1, date_col:int=6, time_col:int=7)
Creates a distplot for the first caloric intake time for a random selection of ‘n’ number of participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
n | int |  | Number of participants to plot for, selected randomly without replacement. |
replace | bool | False | If true, samples with replacement. Samples without replacement by default. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
last_cal_avg_histplot (data_source:Union[str,pandas.core.frame.DataFrame], identifier:int=1, date_col:int=6, time_col:int=7)
Plots a histogram of average last caloric intake for all participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
last_cal_sample_distplot (data_source:Union[str,pandas.core.frame.DataFrame], n:int, replace:bool=False, identifier:int=1, date_col:int=6, time_col:int=7)
Creates a distplot for the last caloric intake time for a random selection of ‘n’ number of participants. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
n | int |  | Number of participants to plot for, selected randomly without replacement. |
replace | bool | False | If true, samples with replacement. Samples without replacement by default. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |
swarmplot (data_source:Union[str,pandas.core.frame.DataFrame], max_loggings:int, identifier:int=1, date_col:int=6, time_col:int=7)
Creates a swarmplot of participants’ logging data. It is recommended that you use find_date and find_float_time to generate necessary date and time columns for this function.
|  | Type | Default | Details |
|---|---|---|---|
data_source | str \| pd.DataFrame |  | String file or folder path. Single .json or .csv paths create a pd.DataFrame. Folder paths with files matching the input pattern are read together into a single pd.DataFrame. Existing dataframes are read as is. Must have a column for ‘food_type’ within the data. |
max_loggings | int |  | Maximum number of randomly selected logs to be plotted for each participant. |
identifier | int | 1 | Column number for an existing unique identifier column in provided data source. Data exported from mCC typically has a unique identifier as its 1st column. |
date_col | int | 6 | Column number for an existing date column in provided data source. |
time_col | int | 7 | Column number for an existing time column in provided data source. |
Returns | matplotlib.figure.Figure | Matplotlib figure object. |