How Can I Read In Values From A Text File And Calculate How Many Times A Value Repeats And Then Find The Average?
I have a text file called text.txt which looks like this:
5.H6 7.891 0.3
6.H6 7.693 0.3
7.H8 8.16859 0.3
8.H6 7.446 0.3
5.H6 7.72158 0.3
9.H8 8.1053 0.3
8.H6 7.65014 0.3
10.H6 7.5
Solution 1:
With df as:
Atom ppm unclear
0 5.H6 7.89100 0.3
1 6.H6 7.69300 0.3
2 7.H8 8.16859 0.3
3 8.H6 7.44600 0.3
4 5.H6 7.72158 0.3
5 9.H8 8.10530 0.3
6 8.H6 7.65014 0.3
7 10.H6 7.54000 0.3
8 12.H6 8.06700 0.3
9 13.H6 8.04700 0.3
10 14.H6 7.69624 0.3
11 6.H6 7.70272 0.3
12 17.H8 7.16900 0.3
13 16.H8 8.27957 0.3
14 18.H6 7.38500 0.3
15 19.H8 7.65700 0.3
16 20.H8 7.78512 0.3
17 21.H8 8.06057 0.3
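The question also asks how to get the file into a DataFrame in the first place. A minimal sketch, assuming text.txt holds three whitespace-separated columns per line with no header row; the column names are simply the ones used in the frame above, and the in-memory sample stands in for the real file:

```python
import io

import pandas as pd

# Stand-in for text.txt: the first few rows from the question.
# With the real file, pass the path "text.txt" instead of `sample`.
sample = io.StringIO(
    "5.H6 7.891 0.3\n"
    "6.H6 7.693 0.3\n"
    "7.H8 8.16859 0.3\n"
    "8.H6 7.446 0.3\n"
    "5.H6 7.72158 0.3\n"
)

# sep=r"\s+" splits on any run of whitespace; header=None because the
# file has no header line, so the column names are supplied explicitly.
df = pd.read_csv(sample, sep=r"\s+", header=None, names=["Atom", "ppm", "unclear"])
print(df)
```

Because labels such as 5.H6 cannot be parsed as numbers, the Atom column is read as text, which is what the grouping step needs.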
Use groupby() to collect information per Atom, then apply aggregation functions as desired:
gb = (df.groupby("Atom", as_index=False)
        .agg({"ppm": ["count", "mean"]})
        .rename(columns={"count": "nVa", "mean": "avgppm"}))
gb.head()
Atom ppm
nVa avgppm
0 10.H6 1 7.54000
1 12.H6 1 8.06700
2 13.H6 1 8.04700
3 14.H6 1 7.69624
4 16.H8 1 8.27957
That gives the workflow for grouping and aggregating, but it's not quite in the format you requested. We can drop the multi-level column structure, although it's not strictly necessary to compute the values you're interested in:
gb.columns = gb.columns.droplevel()
gb = gb.rename(columns={"":"Atom"})
Atom nVa avgppm
0 10.H6 1 7.54000
1 12.H6 1 8.06700
2 13.H6 1 8.04700
3 14.H6 1 7.69624
4 16.H8 1 8.27957
5 17.H8 1 7.16900
6 18.H6 1 7.38500
7 19.H8 1 7.65700
8 20.H8 1 7.78512
9 21.H8 1 8.06057
10 5.H6 2 7.80629
11 6.H6 2 7.69786
12 7.H8 1 8.16859
13 8.H6 2 7.54807
14 9.H8 1 8.10530
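If flat column names are the goal, named aggregation (available in pandas 0.25 and later) is one way to avoid the multi-level columns altogether. This is an alternative sketch rather than the approach used above, shown on a small subset of the rows from df:

```python
import pandas as pd

# A subset of the rows from df, enough to include repeated atoms.
df = pd.DataFrame({
    "Atom": ["5.H6", "6.H6", "7.H8", "5.H6", "6.H6"],
    "ppm": [7.891, 7.693, 8.16859, 7.72158, 7.70272],
})

# Each keyword names an output column; the tuple is (input column, function).
gb = df.groupby("Atom", as_index=False).agg(
    nVa=("ppm", "count"),
    avgppm=("ppm", "mean"),
)
print(gb)
```

The result already has the columns Atom, nVa and avgppm, so no droplevel() or rename step is needed afterwards.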
See the groupby() docs for a full treatment.