-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Vcc2018 #611
base: master
Are you sure you want to change the base?
WIP: Vcc2018 #611
Conversation
SupervisionSet contains 4 supervisions that start at 0 for recording XY for all recordings because each recording has four MOS supervisions. Plus there are 291 recordings which do not have suppervisions - TODO investigate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, we missed reviewing this earlier. I have made some comments.
@@ -56,3 +56,4 @@ | |||
from .voxceleb import * | |||
from .wenet_speech import * | |||
from .yesno import * | |||
from .vcc2018 import * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to keep the imports sorted in lexicographical order. Could you put this import above vctk
?
mos_scores = mos_dir / "vcc2018_evaluation_mos.txt" | ||
assert mos_scores.is_file() | ||
sim_scores = mos_dir / "vcc2018_evaluation_sim.txt" | ||
sim_scores.is_file() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be an assert
?
f"Collecting reference target recordings for the VCC2018 challange from {reference_speech_dir}" | ||
) | ||
|
||
# TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be removed?
return {"recordings": recordings, "supervisions": supervisions} | ||
|
||
|
||
def prepare_mos_supervisions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function is only intended to be used inside this module, it is recommended to add a single underscore, i.e., _prepare_mos_supervisions()
. Note that this does not enforce privacy but only indicates that this method should not be called directly.
return SupervisionSet.from_segments(supervisions) | ||
|
||
|
||
def load_vcc_results(path: Pathlike): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same _load_vcc_results()
""" | ||
Returns pandas.DataFrame | ||
""" | ||
# """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use either multi-line comment (""" """) or single-line (#)
recording_ids = set(mos["left_audio"].tolist()) | ||
supervisions = [] | ||
for recording_id_wav in tqdm(recording_ids, desc="Supervision creation"): | ||
recording_id = recording_id_wav.rstrip(".wav") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you make the recording_ids
above as a set of Path
types, then you could use .stem
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know… but it is less readible with Path(recording_id_wav).stem
since I literally need to use both recording_id wit hand without “.wav” for the original data and Lhotse use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recording_id_wav.name
would give you the file name with the extension.
def prepare_mos_supervisions( | ||
mos_results_path, recordings: RecordingSet, id2trn: Dict[str, str] | ||
) -> SupervisionSet: | ||
# TODO very slow -> make it faster it takes ~8min 170it/s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be because you are manipulating a Pandas dataframe. Pandas does a lot of book-keeping under the hood which makes it good for complicated operations, but here I think it may be better to just use something like a list of namedtuples, and then use groupby
to group them.
@desh2608 I forgot about this WIP PR. I will address your comments next week. |
No description provided.