Current raw-data-loading method on Montgomery, Shenzhen and (probably) Indian datasets is likely cropping lungs
I visually inspected some of the images produced by the centre-cropping in our stock raw-data loader. In some cases, the lungs are (at least) touching the margins of the cropped area.
The ideal fix would be to include a bounding box with each sample in the dataset, and to crop the images so that there is a configurable margin around the lungs. Montgomery and Shenzhen have annotated lung masks, from which bounding boxes can be derived and saved to a file. The "Indian" datasets (aka Dataset A/Dataset B, or DA/DB) do not contain such annotations, and neither do the images from TBX11k.
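
To make the proposal concrete, here is a minimal sketch of how a bounding box with a configurable margin could be derived from a binary lung mask. It is not code from our loader: the names `mask_to_bbox` and `crop_to_bbox` and the pixel-based `margin` parameter are hypothetical, and it assumes each mask is a 2D array in which non-zero pixels mark the lungs. If a dataset ships separate masks for the left and right lungs, they would need to be merged first (e.g. with `np.logical_or`).

```python
import numpy as np


def mask_to_bbox(mask, margin=0):
    """Return (top, left, bottom, right) enclosing all non-zero mask pixels.

    Hypothetical helper: ``margin`` is in pixels and the box is clamped to
    the image borders. ``bottom`` and ``right`` are exclusive, so the result
    can be used directly for slicing.
    """
    mask = np.asarray(mask).astype(bool)
    if mask.ndim != 2:
        raise ValueError("expected a 2D mask")
    if not mask.any():
        raise ValueError("mask contains no lung pixels")

    # Indices of rows/columns that contain at least one lung pixel
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))

    height, width = mask.shape
    top = max(int(rows[0]) - margin, 0)
    bottom = min(int(rows[-1]) + 1 + margin, height)
    left = max(int(cols[0]) - margin, 0)
    right = min(int(cols[-1]) + 1 + margin, width)
    return top, left, bottom, right


def crop_to_bbox(image, bbox):
    """Crop a (H, W) or (H, W, C) array to a (top, left, bottom, right) box."""
    top, left, bottom, right = bbox
    return image[top:bottom, left:right]
```

The boxes could be computed once per sample and stored next to the dataset (for instance in a JSON or CSV file), so that the margin stays a load-time setting. For the datasets without lung masks, the boxes would have to come from somewhere else, such as manual annotation or a segmentation model.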
@mrenzo @arenzom: A couple of questions:
- We analysed how the size of the "area around the lungs" affects the performance of follow-up segmentation. What was the "best" margin we got in that study? I could not find any information on this subject.
- What is the tool you are using to make annotations on ChexPhoto?