compute_noise_ceiling

Compute the noise ceiling for human similarity judgments.

In terms of:

* regression coefficients
* prediction accuracy

Functions:

Name Description
check_gender_suffix

Check the gender suffix.

check_sampling_mode

Check the sampling mode.

compute_cross_session_accuracy

Compute the cross-session accuracy.

compute_maximal_empirical_accuracy_in_session

Compute the maximal empirical accuracy (MEA) for a given session.

compute_maximal_empirical_accuracy_in_triplet

Compute the maximal empirical accuracy (MEA) within a resampled triplet.

get_mea_table

Get the maximal empirical accuracy (MEA) table.

get_mer_table

Get the maximal empirical R (MER) table.

get_r_stats

Get min, mean, max, and variance from a list of correlation coefficients.

get_trial_tables

Get the trial results table for each session.

save_mea_table

Save the maximal empirical accuracy (MEA) table.

save_mer_table

Save the maximal empirical R (MER) table.

statistical_difference_maximal_empirical_accuracy_between_sessions

Run a significance test of differences between maximal empirical accuracies (MEA) for the two sessions (2D, 3D).

check_gender_suffix

check_gender_suffix(suffix: str | None) -> str

Check the gender suffix.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def check_gender_suffix(suffix: str | None) -> str:
    """Check the gender suffix."""
    if suffix:
        suffix = suffix.lower()
        if suffix not in params.GENDERS:
            msg = "suffix must be 'female' OR 'male'!"
            raise ValueError(msg)
        return f"_{suffix}"

    return ""
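
As a rough illustration of how the returned suffix is used, the sketch below re-creates the check with a stand-in for `params.GENDERS` and feeds the result into the choice-table file name that appears later in this module. `GENDERS` and the helper `choice_table_name` are assumptions made for this example and are not part of the package.

```python
from typing import Optional

GENDERS = ("female", "male")  # assumption: stand-in for params.GENDERS


def check_gender_suffix(suffix: Optional[str]) -> str:
    """Return "_<gender>" for a valid gender, or "" if no suffix is given."""
    if suffix:
        suffix = suffix.lower()
        if suffix not in GENDERS:
            msg = "suffix must be 'female' OR 'male'!"
            raise ValueError(msg)
        return f"_{suffix}"
    return ""


def choice_table_name(sampling_mode: str, gender_suffix: Optional[str] = None) -> str:
    """Hypothetical helper: compose the choice-table file name from mode and suffix."""
    suffix = check_gender_suffix(gender_suffix)
    return f"compare_choices_between_sessions_{sampling_mode}{suffix}.csv"


print(choice_table_name("full-sample"))
# compare_choices_between_sessions_full-sample.csv
print(choice_table_name("multi-sub-sample", "Female"))
# compare_choices_between_sessions_multi-sub-sample_female.csv
```

Both `None` and the empty string yield an empty suffix, so callers can pass either when no gender split applies.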

check_sampling_mode

check_sampling_mode(sampling_mode: str) -> str

Check the sampling mode.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def check_sampling_mode(sampling_mode: str) -> str:
    """Check the sampling mode."""
    sampling_mode = sampling_mode.lower()
    if sampling_mode not in {"full-sample", "multi-sub-sample"}:
        msg = "Mode must be either 'full-sample' or 'multi-sub-sample'!"
        raise ValueError(msg)
    return sampling_mode

compute_cross_session_accuracy

compute_cross_session_accuracy(
    trial_results_table_2d: DataFrame,
    trial_results_table_3d: DataFrame,
    sampling_mode: str,
    gender_suffix: str = "",
) -> None

Compute the cross-session accuracy.

Question

"Can trials in the 3D-condition be predicted from trials in the 2D-condition?"

In case there are multiple samples of the same triplet ID, the most frequent choice is taken.

Minimal impact of the random choices

The partially random choices during the comparison between the two viewing conditions (below) can lead to slightly different results in the 'match' column if the function is run again. Nonetheless, the contribution of these random choices should be small, since they should cancel each other out (no-match vs. match) in terms of their impact on the overall accuracy.

Parameters:

Name Type Description Default
trial_results_table_2d DataFrame

trial results table of the 2D-session

required
trial_results_table_3d DataFrame

trial results table of the 3D-session

required
sampling_mode str

"full-sample" or "multi-sub-sample"

required
gender_suffix str

gender suffix "female" or "male" (if applicable) else empty string ""

''
Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def compute_cross_session_accuracy(
    trial_results_table_2d: pd.DataFrame,
    trial_results_table_3d: pd.DataFrame,
    sampling_mode: str,
    gender_suffix: str = "",
) -> None:
    """
    Compute the cross-session accuracy.

    !!! question
        "Can trials in the 3D-condition be predicted from trials in the 2D-condition?"
    In case there are multiple samples of the same triplet ID, the most frequent choice is taken.

    ??? note "Minimal impact of the random choices"
        The partially random choices during the comparison between the two viewing conditions (below) can lead to
        slightly different results in the `'match'` column if the function is run again.
        Nonetheless, the contribution of these random choices should be small,
        since they should cancel each other out (no-match vs. match) in terms of their impact on the overall accuracy.

    :param trial_results_table_2d: trial results table of the 2D-session
    :param trial_results_table_3d: trial results table of the 3D-session
    :param sampling_mode: "full-sample" or "multi-sub-sample"
    :param gender_suffix: gender suffix "female" or "male" (if applicable) else empty string ""
    """
    # Check triplet IDs
    if set(trial_results_table_2d.triplet) != set(trial_results_table_3d.triplet):
        msg = "Triplets (ID's) must match between tables!"
        raise ValueError(msg)

    if not np.all(
        trial_results_table_2d.sort_values(by=["triplet_id"]).triplet.unique()
        == trial_results_table_3d.sort_values(by=["triplet_id"]).triplet.unique()
    ):
        msg = "Triplet ID to triplet mapping must match between tables!"
        raise ValueError(msg)

    # Check sampling_mode
    sampling_mode = check_sampling_mode(sampling_mode=sampling_mode)
    suffix = check_gender_suffix(suffix=gender_suffix)

    # First create a choice table per session (2D, 3D)
    path_to_choice_table = Path(
        paths.results.main.behavior, f"compare_choices_between_sessions_{sampling_mode}{suffix}.csv"
    )
    if path_to_choice_table.exists():
        choice_table = pd.read_csv(path_to_choice_table, index_col="triplet_id", low_memory=False)

    else:
        choice_table = pd.DataFrame(
            columns=["head_odd_2D", "head_odd_3D", "match"],
            index=np.sort(trial_results_table_2d.triplet_id.unique()),
        )
    choice_table.index.name = "triplet_id"

    # Fill choice table via a majority vote per triplet-ID
    new_entries = False

    for tid in tqdm(
        choice_table.index,
        desc=f"Filling choice table with {sampling_mode}{suffix} data",
        total=len(choice_table),
        colour="#63B456",
    ):
        if not pd.isna(choice_table.loc[tid]).all():
            continue
        new_entries = True

        # Get triplet table for current triplet ID
        tid_tab_2d = trial_results_table_2d.loc[trial_results_table_2d.triplet_id == tid, "head_odd"]
        tid_tab_3d = trial_results_table_3d.loc[trial_results_table_3d.triplet_id == tid, "head_odd"]

        # Get value counts in both conditions
        tid_tab_2d_vc = tid_tab_2d.value_counts()
        tid_tab_3d_vc = tid_tab_3d.value_counts()

        # Keep only max count values
        tid_tab_2d_vc = tid_tab_2d_vc[tid_tab_2d_vc == tid_tab_2d_vc.max()]
        tid_tab_3d_vc = tid_tab_3d_vc[tid_tab_3d_vc == tid_tab_3d_vc.max()]

        if len(tid_tab_2d_vc) > 1 or len(tid_tab_3d_vc) > 1:
            # At least two heads were chosen n times

            # Fill choice table with choices
            choice_table.loc[tid, "head_odd_2D"] = sorted(tid_tab_2d_vc.index.astype(int).tolist())
            choice_table.loc[tid, "head_odd_3D"] = sorted(tid_tab_3d_vc.index.astype(int).tolist())

            if choice_table.loc[tid, "head_odd_2D"] == choice_table.loc[tid, "head_odd_3D"]:
                # Case 1: The most chosen heads are equally distributed across viewing conditions
                choice_table.loc[tid, "match"] = True
                continue

            if len(choice_table.loc[tid, "head_odd_2D"]) == len(choice_table.loc[tid, "head_odd_3D"]):
                # Case 2: In both conditions we have the same number of most chosen heads, but the heads are not equal
                # We randomly draw from the most selected heads & compare them
                choice_table.loc[tid, "match"] = np.random.choice(tid_tab_2d_vc.index) == np.random.choice(
                    tid_tab_3d_vc.index
                )
                continue

            # Case 3: The most chosen heads are not equally distributed across viewing conditions
            # We randomly draw from the most selected heads & compare them
            choice_table.loc[tid, "match"] = np.random.choice(tid_tab_2d_vc.index) == np.random.choice(
                tid_tab_3d_vc.index
            )
        else:
            choice_table.loc[tid, "head_odd_2D"] = int(tid_tab_2d_vc.index[0])
            choice_table.loc[tid, "head_odd_3D"] = int(tid_tab_3d_vc.index[0])
            choice_table.loc[tid, "match"] = (
                choice_table.loc[tid, "head_odd_2D"] == choice_table.loc[tid, "head_odd_3D"]
            )

    # Save choice table if there are new entries
    if new_entries:
        print("There are new entries in the choice_table which will be saved")
        path_to_choice_table.parent.mkdir(parents=True, exist_ok=True)
        choice_table.to_csv(path_to_choice_table)
    else:
        print("There are no new entries in the choice_table.")

    # Get & fill MEA table
    mea_df = get_mea_table()
    mea_df.loc[("both", f"{sampling_mode}{suffix}"), :] = (
        choice_table.match.astype(int).mean(),
        np.minimum(
            trial_results_table_2d.triplet_id.value_counts().min(),
            trial_results_table_3d.triplet_id.value_counts().min(),
        ),
    )

    cprint(
        string=f"\n{sampling_mode.title()}{suffix}: Cross-session accuracy (using the majority vote in "
        f"triplets with multiple samples): {choice_table.match.astype(int).mean():.2%}",
        col="g",
    )

    # Save MEA table
    save_mea_table(mea_df=mea_df)
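
The majority vote with tie handling can be hard to follow inside the table-filling loop, so here is a minimal, pandas-free sketch of the same decision rule for a single triplet ID. The helper names `top_choices` and `session_match` and the example head numbers are made up for illustration. The sketch only mirrors the logic shown above and does not reproduce the choice table itself.

```python
import random
from collections import Counter


def top_choices(choices):
    """Return the sorted list of most frequent choices (several entries on a tie)."""
    counts = Counter(choices)
    max_count = max(counts.values())
    return sorted(head for head, n in counts.items() if n == max_count)


def session_match(choices_2d, choices_3d, rng):
    """Decide whether the majority choice agrees between the two sessions."""
    top_2d, top_3d = top_choices(choices_2d), top_choices(choices_3d)
    if top_2d == top_3d:
        # Same winner, or the same set of tied winners, in both sessions
        return True
    # Otherwise draw one candidate per session at random and compare them
    return rng.choice(top_2d) == rng.choice(top_3d)


rng = random.Random(0)  # seeded here only to make the sketch reproducible

# Hypothetical odd-one-out choices (head numbers) for three triplet IDs
matches = [
    session_match([12, 12, 40], [12, 12, 12], rng),  # same clear winner
    session_match([12, 40], [12, 40], rng),  # identical tie
    session_match([12, 12, 40], [40, 40, 12], rng),  # different clear winners
]
print(matches)  # [True, True, False]
print(f"cross-session accuracy: {sum(matches) / len(matches):.2%}")
```

Only triplets with non-identical ties ever reach the random draw. Seeding the generator, as done in this sketch, is one way to make the 'match' column reproducible across runs. The function above uses the unseeded `np.random.choice`.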

compute_maximal_empirical_accuracy_in_session

compute_maximal_empirical_accuracy_in_session(
    session: str,
    tr_table_of_session: DataFrame,
    sampling_mode: str,
    gender_suffix: str = "",
) -> None

Compute the maximal empirical accuracy (MEA) for a given session.

Parameters:

Name Type Description Default
session str

"2D" or "3D"

required
tr_table_of_session DataFrame

trial results table of session

required
sampling_mode str

"full-sample" or "multi-sub-sample"

required
gender_suffix str

gender suffix "female" or "male" (if applicable) else empty string ""

''
Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def compute_maximal_empirical_accuracy_in_session(
    session: str, tr_table_of_session: pd.DataFrame, sampling_mode: str, gender_suffix: str = ""
) -> None:
    """
    Compute the maximal empirical accuracy (`MEA`) for a given session.

    :param session: "2D" or "3D"
    :param tr_table_of_session: trial results table of session
    :param sampling_mode: "full-sample" or "multi-sub-sample"
    :param gender_suffix: gender suffix "female" or "male" (if applicable) else empty string ""
    """
    # Checks
    sampling_mode = check_sampling_mode(sampling_mode=sampling_mode)
    suffix = check_gender_suffix(suffix=gender_suffix)

    # Get MEA table
    mea_df = get_mea_table()

    # Get value counts of triplets in table
    tr_val_ctn_table = (
        tr_table_of_session[["triplet_id", "triplet"]]
        .value_counts()
        .rename_axis(["triplet_id", "triplet"])
        .reset_index(name="counts")
    )

    # Compute max empirical accuracy for each triplet
    # Note for the multi-subsampling set, the triplet_id is different from previous sets
    min_n_samples = MIN_N_SAMPLES_PER_TRIPLET
    while True:
        poss_accs = []  # init
        len_trips = []  # init
        for _triplet, multi_triplet_tr in tr_table_of_session[
            tr_table_of_session.triplet.isin(
                tr_val_ctn_table[  # take triplets sampled multiple times
                    tr_val_ctn_table.counts >= min_n_samples
                ].triplet
            )
        ].groupby("triplet"):
            poss_accs.append(compute_maximal_empirical_accuracy_in_triplet(choices=multi_triplet_tr.head_odd.values))
            len_trips.append(len(multi_triplet_tr.head_odd.values))

        if len(poss_accs) == 0:
            break

        # Weighted aggregation of max empirical accuracy over triplets
        max_acc = np.sum(np.array(poss_accs) * np.array(len_trips)) / np.sum(len_trips)
        cprint(
            string=f"\n{sampling_mode.title()} {session}{suffix}: "
            f"Maximal empirical accuracy over {len(len_trips)} "
            f"triplets (with {min_n_samples} or more samples): {max_acc:.2%}",
            col="g",
        )
        mea_df.loc[(session, f"{sampling_mode}{suffix}"), :] = max_acc, min_n_samples

        if not CHECK_MORE_SAMPLES:
            break
        min_n_samples += 1

    # Save MEA table
    save_mea_table(mea_df=mea_df)
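
The session-level value is a sample-size-weighted mean of the per-triplet accuracies: the sum of (accuracy × number of samples) over triplets, divided by the total number of samples. The sketch below works through that aggregation on toy data. The function names, the choices, and the threshold of 5 are assumptions for illustration, since the actual value of `MIN_N_SAMPLES_PER_TRIPLET` is not shown here.

```python
from collections import Counter


def triplet_mea(choices):
    """Share of samples that fall on the most frequent choice."""
    return max(Counter(choices).values()) / len(choices)


def session_mea(trials_by_triplet, min_n_samples):
    """Weighted MEA over all triplets with at least `min_n_samples` samples."""
    kept = [c for c in trials_by_triplet.values() if len(c) >= min_n_samples]
    if not kept:
        return None  # no triplet was sampled often enough
    accs = [triplet_mea(c) for c in kept]
    sizes = [len(c) for c in kept]
    return sum(a * n for a, n in zip(accs, sizes)) / sum(sizes)


# Hypothetical choices per triplet (values are head numbers)
trials = {
    "a": [1, 1, 1, 2, 3],  # 3 of 5 samples agree
    "b": [4, 4, 4, 4, 4, 4, 4, 5, 5, 6],  # 7 of 10 samples agree
    "c": [7, 8],  # too few samples, dropped for min_n_samples=5
}

print(f"{session_mea(trials, min_n_samples=5):.2%}")  # (3 + 7) / 15
```

Weighting by the number of samples makes the result equal to the share of all retained trials that agree with their triplet's majority choice, so frequently sampled triplets count for more than in a plain mean of the per-triplet accuracies.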

compute_maximal_empirical_accuracy_in_triplet

compute_maximal_empirical_accuracy_in_triplet(
    choices: ndarray,
) -> float | float64

Compute the maximal empirical accuracy (MEA) within a resampled triplet.

Here, MEA is defined as the accuracy obtained when the most frequent choice across all samples of the given triplet is always predicted.

Parameters:

Name Type Description Default
choices ndarray

array of choices for the same triplet

required

Returns:

Type Description
float | float64

maximal empirical accuracy

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def compute_maximal_empirical_accuracy_in_triplet(choices: np.ndarray) -> float | np.float64:
    """
    Compute the maximal empirical accuracy (`MEA`) within a resampled triplet.

    Here, `MEA` is defined as the accuracy obtained when the most frequent choice across all samples of the given
    triplet is always predicted.

    :param choices: array of choices for the same triplet
    :return: maximal empirical accuracy
    """
    _, ctns = np.unique(choices, return_counts=True)  # _ = vals
    return np.max(ctns) / np.sum(ctns)  # max_accuracy
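
A few worked cases may make the definition more concrete. The sketch below is a standard-library equivalent of the NumPy one-liner above, written under the assumption that `choices` holds the chosen head number of each repeated trial. The name `mea_in_triplet` and the example values are illustrative only.

```python
from collections import Counter


def mea_in_triplet(choices):
    """Largest choice count divided by the total number of samples."""
    counts = Counter(choices)
    return max(counts.values()) / sum(counts.values())


print(mea_in_triplet([3, 3, 7, 3, 9]))  # 0.6 (head 3 chosen in 3 of 5 samples)
print(mea_in_triplet([3, 3, 3, 3]))  # 1.0 (unanimous)
print(mea_in_triplet([3, 7]))  # 0.5 (two-way tie)
```

If every trial picks one of the three heads of its triplet, the most frequent choice covers at least a third of the samples, so the value cannot drop below 1/3.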

get_mea_table

get_mea_table()

Get the maximal empirical accuracy (MEA) table.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def get_mea_table():
    """Get the maximal empirical accuracy (`MEA`) table."""
    if PATH_TO_ACC_TABLE.exists():
        mea_df = pd.read_csv(PATH_TO_ACC_TABLE, index_col=["session", "sample_type"])
    else:
        mea_df = pd.DataFrame(columns=["session", "sample_type", "max_acc", "min_n_samples"])
        mea_df = mea_df.set_index(["session", "sample_type"])
    return mea_df

get_mer_table

get_mer_table()

Get the maximal empirical R (MER) table.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def get_mer_table():
    """Get the maximal empirical R (`MER`) table."""
    if PATH_TO_R_TABLE.exists():
        mer_df = pd.read_csv(PATH_TO_R_TABLE, index_col=["session", "sample_type"])
    else:
        mer_df = pd.DataFrame(
            columns=["session", "sample_type", "min_r", "mean_r", "max_r", "var_r", "max_p_value", "min_n_samples"]
        )
        mer_df = mer_df.set_index(["session", "sample_type"])

    return mer_df

get_r_stats

get_r_stats(ls_r: list) -> tuple

Get min, mean, max, and variance from a list of correlation coefficients.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def get_r_stats(ls_r: list) -> tuple:
    """Get min, mean, max, and variance from a list of correlation coefficients."""
    return np.min(ls_r), np.mean(ls_r), np.max(ls_r), np.var(ls_r)
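
One detail worth spelling out: `np.var` uses `ddof=0` by default, so the fourth value is the population variance rather than the sample variance. The sketch below reproduces the four statistics with the standard library on made-up correlation coefficients. The name `r_stats` is a stand-in used only for this example.

```python
from statistics import fmean, pvariance, variance


def r_stats(ls_r):
    """Return (min, mean, max, population variance) of correlation coefficients."""
    return min(ls_r), fmean(ls_r), max(ls_r), pvariance(ls_r)


rs = [0.2, 0.4, 0.6]  # hypothetical correlation coefficients
min_r, mean_r, max_r, var_r = r_stats(rs)

print(f"min={min_r:.2f}, mean={mean_r:.2f}, max={max_r:.2f}, var={var_r:.4f}")
# min=0.20, mean=0.40, max=0.60, var=0.0267

# For comparison: the sample variance (ddof=1) is larger for the same data
print(f"sample variance={variance(rs):.4f}")
# sample variance=0.0400
```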

get_trial_tables

get_trial_tables(
    multi_sub_sample_only: bool = True,
) -> dict

Get the trial results table for each session.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def get_trial_tables(multi_sub_sample_only: bool = True) -> dict:
    """Get the trial results table for each session."""
    trial_table_dict = {}  # init dict to store trial results tables
    if multi_sub_sample_only:
        for session in params.SESSIONS:
            trial_table_dict[session] = read_trial_results_of_set(
                set_nr=f"{session[0]}.20", clean_trials=True, verbose=False
            )
            # Note that in 2D-table there are more than 5 samples of the following triplet-IDs:
            #
            # > ID:     COUNT
            #   -------------
            #   118:    13
            #   419:    12
            #   803:    12
            #   643:    11
            #   1018:   11
            #   331:    11
            #   62:     11
            #   -------------
            # > SUM:    46
            #
            # This leads to 5,746 trials in total, whereas in the 3D-condition we have the expected 5,700 trials.

    else:
        for session in params.SESSIONS:
            trial_table_dict[session] = read_trial_results_of_session(
                session=session,
                clean_trials=True,
                drop_subsamples=True,  # here we remove the additionally acquired sub-sample from the data
                verbose=False,
            )

    # Process data
    for session in params.SESSIONS:
        # Select columns
        trial_table_dict[session] = trial_table_dict[session][
            ["triplet_id", "triplet", "head1", "head2", "head3", "head_odd"]
        ]
        # Convert column types
        trial_table_dict[session] = trial_table_dict[session].astype(
            {
                "triplet_id": int,
                "head1": int,
                "head2": int,
                "head3": int,
                "head_odd": int,
            }
        )

    return trial_table_dict
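
The trial counts quoted in the code comment above can be checked with a few lines of arithmetic. The sketch assumes that the "SUM" of 46 refers to the samples beyond the expected 5 per triplet ID. This is an interpretation of the comment, not something the source states explicitly.

```python
EXPECTED_SAMPLES_PER_TRIPLET = 5  # assumption: read from "more than 5 samples"

# Oversampled triplet IDs and their counts in the 2D table (from the comment)
oversampled = {118: 13, 419: 12, 803: 12, 643: 11, 1018: 11, 331: 11, 62: 11}

# Samples beyond the expected number, summed over the oversampled triplets
excess = sum(n - EXPECTED_SAMPLES_PER_TRIPLET for n in oversampled.values())
print(excess)  # 46

# Expected trial count (as in the 3D condition) plus the surplus
print(5700 + excess)  # 5746
```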

save_mea_table

save_mea_table(mea_df: DataFrame)

Save the maximal empirical accuracy (MEA) table.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def save_mea_table(mea_df: pd.DataFrame):
    """Save the maximal empirical accuracy (`MEA`) table."""
    # Prepare the path
    PATH_TO_ACC_TABLE.parent.mkdir(parents=True, exist_ok=True)

    # Round values
    mea_df["max_acc"] = mea_df["max_acc"].round(4)

    # Save
    mea_df.to_csv(PATH_TO_ACC_TABLE)

save_mer_table

save_mer_table(mer_df: DataFrame)

Save the maximal empirical R (MER) table.

Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def save_mer_table(mer_df: pd.DataFrame):
    """Save the maximal empirical R (`MER`) table."""
    # Prepare the path
    PATH_TO_R_TABLE.parent.mkdir(parents=True, exist_ok=True)

    # Sort index
    mer_df = mer_df.sort_index()
    print(mer_df[["mean_r"]])  # ['min_r', 'mean_r', 'max_r']

    # Round values
    mer_df[["min_r", "mean_r", "max_r"]] = mer_df[["min_r", "mean_r", "max_r"]].round(4)
    mer_df[["var_r"]] = mer_df[["var_r"]].round(6)
    mer_df[["max_p_value"]] = mer_df[["max_p_value"]].round(8)

    # Save
    cprint(string="\nSaving empirical R df ...", col="b")
    mer_df.to_csv(PATH_TO_R_TABLE)

statistical_difference_maximal_empirical_accuracy_between_sessions

statistical_difference_maximal_empirical_accuracy_between_sessions(
    tr_table_of_2d: DataFrame, tr_table_of_3d: DataFrame
) -> None

Run a significance test of differences between maximal empirical accuracies (MEA) for the two sessions (2D, 3D).

Parameters:

Name Type Description Default
tr_table_of_2d DataFrame

trial results table of 2D session

required
tr_table_of_3d DataFrame

trial results table of 3D session

required
Source code in code/facesim3d/modeling/compute_noise_ceiling.py
def statistical_difference_maximal_empirical_accuracy_between_sessions(
    tr_table_of_2d: pd.DataFrame,
    tr_table_of_3d: pd.DataFrame,
) -> None:
    """
    Run a significance test of differences between maximal empirical accuracies (MEA) for the two sessions (2D, 3D).

    :param tr_table_of_2d: trial results table of 2D session
    :param tr_table_of_3d: trial results table of 3D session
    """
    poss_accs_dict = {}  # init
    len_trips_dict = {}  # init
    for sess in params.SESSIONS:
        tr_table_of_session = {"2D": tr_table_of_2d, "3D": tr_table_of_3d}[sess]

        # Get value counts of triplets in table
        tr_val_ctn_table = (
            tr_table_of_session[["triplet_id", "triplet"]]
            .value_counts()
            .rename_axis(["triplet_id", "triplet"])
            .reset_index(name="counts")
        )

        # Compute max empirical accuracy for each triplet
        # Note for the multi-subsampling set, the triplet_id is different from previous sets
        poss_accs = []  # init
        len_trips = []  # init
        for _triplet, multi_triplet_tr in tr_table_of_session[
            tr_table_of_session.triplet.isin(
                tr_val_ctn_table[  # take triplets sampled multiple times
                    tr_val_ctn_table.counts >= MIN_N_SAMPLES_PER_TRIPLET
                ].triplet
            )
        ].groupby("triplet"):
            poss_accs.append(compute_maximal_empirical_accuracy_in_triplet(choices=multi_triplet_tr.head_odd.values))
            len_trips.append(len(multi_triplet_tr.head_odd.values))

        # Fill dict
        poss_accs_dict[sess] = poss_accs
        len_trips_dict[sess] = len_trips

    # Run significance test:
    is_normal_dist = True  # init
    for sess in params.SESSIONS:
        # Test for normal distribution
        a_d_test = sm.stats.diagnostic.normal_ad(np.array(poss_accs_dict[sess]))
        if a_d_test[1] < 0.05 / 2:  # Bonferroni-corrected
            cprint(
                string=f"Maximal empirical accuracy (MEA) of {sess} is not normally distributed "
                f"(Anderson-Darling test: {a_d_test[0]:.2f}, p-value={a_d_test[1]:.2g})",
                col="r",
            )
            is_normal_dist = False
    if is_normal_dist:  # not the case
        t_stat, p_val, df = sm.stats.ttest_ind(
            poss_accs_dict["2D"], poss_accs_dict["3D"], alternative="two-sided", usevar="pooled"
        )
        cprint(
            string=f"\nStatistical difference between maximal empirical accuracies (MEA) of 2D & 3D: "
            f"t-statistic(df={df})={t_stat:.2f}, p-value={p_val:.2g}",
            col="g",
        )
    else:
        # Stats comparison for non-normal distributions
        # The Mann-Whitney U test is a non-parametric version of the t-test for independent samples.
        u2d, p_val = mannwhitneyu(poss_accs_dict["2D"], poss_accs_dict["3D"], alternative="two-sided", method="auto")
        nx, ny = len(poss_accs_dict["2D"]), len(poss_accs_dict["3D"])
        assert nx == ny  # noqa: S101
        u3d = nx * ny - u2d

        u_test = np.minimum(u2d, u3d)

        # Calculate effect size
        # 1. Pearson's r (however, not applicable for non-normal distributions)
        se = np.sqrt(nx * ny * (nx + ny + 1) / 12)
        e_u = nx * ny / 2  # expected value
        z = (u_test - e_u) / se  # z-score
        pearson_r = z / np.sqrt(nx + ny)

        # 2. Probability of superiority (better for non-normal distributions)
        ps = u_test / (nx * ny)

        cprint(
            string=f"\nStatistical difference between maximal empirical accuracies (MEA) of 2D & 3D (both N={nx}): "
            f"Mann-Whitney U-Test={u_test:.1f}, Z={z:.2f}, p-value={p_val:.2g}; "
            f"effect size: r={pearson_r:.2f}, ps={ps:.2%}.",
            col="g",
        )
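
To make the effect-size arithmetic in the non-parametric branch easier to follow, here is a small standard-library sketch that computes the U statistic by pair counting and then applies the same formulas for z, r, and the probability of superiority. The function name `mann_whitney_effect_sizes` and the toy accuracies are assumptions for this example. The p-value is left out, since the function above obtains it from `mannwhitneyu`.

```python
import math


def mann_whitney_effect_sizes(x, y):
    """Return U (smaller of the two), z, r, and probability of superiority."""
    nx, ny = len(x), len(y)
    # U for x: number of pairs with x_i > y_j, counting ties as one half
    u_x = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0 for xi in x for yj in y)
    u_y = nx * ny - u_x
    u = min(u_x, u_y)

    se = math.sqrt(nx * ny * (nx + ny + 1) / 12)  # no tie correction, as above
    z = (u - nx * ny / 2) / se  # distance of U from its expected value
    r = z / math.sqrt(nx + ny)
    ps = u / (nx * ny)
    return {"u": u, "z": z, "r": r, "ps": ps}


# Hypothetical per-triplet MEA values for the two sessions
mea_2d = [0.6, 0.8, 1.0, 0.6]
mea_3d = [0.4, 0.6, 0.8, 0.6]

stats = mann_whitney_effect_sizes(mea_2d, mea_3d)
print(f"U={stats['u']:.1f}, Z={stats['z']:.2f}, r={stats['r']:.2f}, ps={stats['ps']:.2%}")
```

Because the smaller of the two U values is used, z and r are never positive and ps never exceeds 50%. These numbers therefore describe the size of the difference. Which session has the higher values must be read from the data themselves.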