Next Steps @Lynn
- Replicate the 29 biomes for data2 that we have for data1
- Time series forecasting for all biomes for data1 and data2
- Keep track of which biomes have more/less data
- Check whether the qnet for data2 is generated at a higher p-value
- Move to the next level of microbiome organization: genus
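For keeping track of which biomes have more/less data, a small helper along these lines might be enough. This is only a sketch: the (sample_id, biome, abundance) record layout and both function names are assumptions made for illustration, not the layout of data1 or data2.

```python
from collections import Counter


def count_observations(records):
    """Count non-missing abundance values per biome.

    `records` is an iterable of (sample_id, biome, abundance) tuples,
    with abundance set to None when there is no measurement.
    This layout is hypothetical.
    """
    counts = Counter()
    for _sample_id, biome, abundance in records:
        counts[biome] += 0  # register the biome even if every value is missing
        if abundance is not None:
            counts[biome] += 1
    return counts


def rank_by_coverage(counts):
    """Return (biome, count) pairs, most data first; ties broken by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy records for illustration only, not real measurements.
example_records = [
    ("s1", "Firmicutes", 0.4),
    ("s1", "Bacteroidetes", None),
    ("s2", "Firmicutes", 0.5),
    ("s2", "Bacteroidetes", 0.1),
    ("s3", "Firmicutes", 0.2),
]
example_ranking = rank_by_coverage(count_observations(example_records))
```

Running the same count on data1 and data2 would give two rankings that can be compared side by side.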
Organize Pipeline @Alice
Write Python classes for the following:
- Data formatting
- Quantization
- Qnet generation
- Masking check
- Forecast generation
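As a starting point for the quantization class, here is a minimal sketch. The class name `Quantizer`, the fit/transform split, and the choice of equal-width bins are all assumptions for illustration; the binning rule the pipeline should actually use is still open.

```python
from bisect import bisect_right


class Quantizer:
    """Map continuous abundances to integer levels 0..numlevels-1.

    Hypothetical class: equal-width binning is an assumption here,
    not a decision made in these notes.
    """

    def __init__(self, numlevels=5):
        if numlevels < 2:
            raise ValueError("numlevels must be at least 2")
        self.numlevels = numlevels
        self.edges = None

    def fit(self, values):
        # Interior bin edges, equally spaced between the observed min and max.
        lo, hi = min(values), max(values)
        width = (hi - lo) / self.numlevels
        self.edges = [lo + width * i for i in range(1, self.numlevels)]
        return self

    def transform(self, values):
        if self.edges is None:
            raise RuntimeError("call fit() before transform()")
        # bisect_right counts the edges <= value, which is the level index.
        # Values outside the fitted range fall into the first or last level.
        return [bisect_right(self.edges, v) for v in values]


quantizer = Quantizer(numlevels=5).fit([0.0, 1.0])
levels = quantizer.transform([0.0, 0.25, 0.5, 0.95, 1.0])
```

Keeping `fit` separate from `transform` would let the edges learned on one dataset be reused on another, if that turns out to be wanted.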
The objective is to go from a dataset like https://raw.githubusercontent.com/zeroknowledgediscovery/course_notes/master/datasets/MuBIOME_/data_1/SamplesByMetadata_otuDADA2_DIABIMMUNE_RSRC_TaxaRelativeAbundance.csv to an interface like the following:
import zbiome as zb

zb.getdata(filepath, tax='phylum')                       # data formatting
zb.quantize(numlevels=5)                                 # quantization
zb.qnet(significance_level=0.9)                          # qnet generation
df = zb.masked_runs(biomes='all', mask_percentage=10)    # masking check
ef = zb.forecast(biomes='all', sampleid='all', observation_periods=4)  # forecast generation
zb.generate_hypothesis(time, causality_window)
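One possible reading of the masking step behind `masked_runs` is: hide a fixed percentage of known values, have the model fill them back in, and score the agreement. The sketch below assumes that reading. `mask_values` and `recovery_rate` are hypothetical helper names, and the model that produces the predictions is left out entirely.

```python
import random


def mask_values(values, mask_percentage=10, seed=0):
    """Return (masked_copy, masked_indices); masked entries become None."""
    n_mask = round(len(values) * mask_percentage / 100)
    rng = random.Random(seed)  # seeded so a masked run can be repeated
    masked_indices = sorted(rng.sample(range(len(values)), n_mask))
    masked = list(values)
    for i in masked_indices:
        masked[i] = None
    return masked, masked_indices


def recovery_rate(truth, predicted, masked_indices):
    """Fraction of masked positions where the prediction equals the truth."""
    if not masked_indices:
        return 0.0
    hits = sum(1 for i in masked_indices if predicted[i] == truth[i])
    return hits / len(masked_indices)


# Toy quantized levels for illustration only.
true_levels = [0, 1, 2, 3, 4] * 4
masked_levels, hidden = mask_values(true_levels, mask_percentage=10, seed=1)
```

With 20 values and `mask_percentage=10`, two positions are hidden; `recovery_rate(true_levels, predictions, hidden)` would then score whatever predictions the qnet returns for those positions.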
