Data Filter / Trial Selections

Select specific trials and labels for a binary classification problem.

To prepare the data in a format ready for a machine learning model, we need the following preprocessing steps. Along the way, we need to provide some information about the positive and negative classes.

# imports
from EEG_Familiarity import preproc

Specify the file path and instantiate a preproc object.

file_path = "../data/data_CRMN_vs_MMN_imbalLDA_order_proj_1.mat"
data_preproc = preproc(file_path, experiment_num=1)
data_preproc
<EEG_Familiarity.preproc.preproc>

Specify the positive-class and negative-class indices via preproc.filter_index. For the numbering system, please refer to Data Format.

pos1, neg1 = data_preproc.filter_index(2,5,2,4)
pos2, neg2 = data_preproc.filter_index(4,5,4,4)

Based on these filters, we can perform an inner merge between the two classes using preproc.merge_two_class. After merging, we can get the data directly using preproc.get_data_by_index.

pos_idx, neg_idx = data_preproc.merge_two_class(pos1, neg1, pos2, neg2)
X, y, subject = data_preproc.get_data_by_index(pos_idx, neg_idx)

By doing so, we construct \(X\) and \(y\) for this specific classifier, along with the subject identifiers.

X.shape, y.shape, subject.shape
((3813, 72), (3813,), (3813,))

From the output, this particular classifier has \(3813\) observations, each with a \(72\)-dimensional feature vector.
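
Before fitting a model, it can be useful to confirm that the three arrays line up row by row and to look at the class balance for each subject. The following is a minimal sketch and not part of EEG_Familiarity: the helper name summarize_dataset is hypothetical, the arrays are randomly generated stand-ins with the same shapes as above, and it assumes that labels are coded 0/1 and that subject holds integer identifiers.

```python
import numpy as np


def summarize_dataset(X, y, subject):
    """Check that X, y and subject are aligned and report class counts.

    Hypothetical helper for illustration; not part of EEG_Familiarity.
    """
    # Every observation needs exactly one label and one subject identifier.
    if not (X.shape[0] == y.shape[0] == subject.shape[0]):
        raise ValueError("X, y and subject must have the same number of rows")

    classes, class_counts = np.unique(y, return_counts=True)
    summary = {
        "n_observations": X.shape[0],
        "n_features": X.shape[1],
        "class_counts": dict(zip(classes.tolist(), class_counts.tolist())),
        "per_subject": {},
    }
    # Count positives and negatives separately for each subject.
    for s in np.unique(subject):
        mask = subject == s
        summary["per_subject"][int(s)] = {
            "positive": int(np.sum(y[mask] == 1)),
            "negative": int(np.sum(y[mask] == 0)),
        }
    return summary


# Stand-in data with the same shapes as above. Assumptions: 0/1 labels and
# integer subject identifiers; the values are random, not real EEG features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(3813, 72))
y_demo = rng.integers(0, 2, size=3813)
subject_demo = rng.integers(1, 11, size=3813)

summary = summarize_dataset(X_demo, y_demo, subject_demo)
print(summary["n_observations"], summary["n_features"])
```

With the real arrays, the same call would be summarize_dataset(X, y, subject), provided the label and subject coding match the assumptions stated above. A strongly skewed class count for some subjects would suggest handling imbalance explicitly before training.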