3. Module Run

The class Reader is designed to selectively extract data from an mzML file and to expose the data as a Python object. The necessary information is read in and stored in a format that allows fast access. The reader itself is an iterator, so looping over all spectra follows the usual Python syntax. In addition, spectra can be accessed randomly by their nativeID, provided the file is not compressed or truncated by a conversion program.

The class Writer is still in development.

class run.Reader
__init__(path*[, noiseThreshold=0.0, extraAccessions=None, MS1_Precision=5e-6, MSn_Precision=20e-6])


Initializes an mzML run and returns an iterator.

  • path (string) – path to mzML file. File can be gzipped.
  • extraAccessions (list of tuples) –

    list of additional (accession,fieldName) tuples.

    For example, ('MS:1000285', ['value']) will extract the "total ion current" and store it under two keys in the spectrum, i.e. spectrum["total ion current"] and spectrum['MS:1000285'].

    The translated name is taken from the current OBO file, so the name defined by the HUPO-PSI consortium (http://www.psidev.info/) is used.

    pymzML comes with an example script, queryOBO.py, which can be used to look up the names or MS tags (see: queryOBO).

    The value, i.e. the XML attribute to extract, has to be provided by the user. Several attributes can be requested at once, e.g. ('MS:1000016', ['value', 'unitName']) will extract the scan time and its unit.

  • MS1_Precision (float) – measured precision of MS1 spectra
  • MSn_Precision (float) – measured precision of MSn spectra
  • build_index_from_scratch (boolean) – build index from scratch
  • file_object (file-like object) – file object or any other iterable stream; if given, path is ignored and seeking is disabled
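The effect of extraAccessions can be illustrated with a small standard-library sketch. It does not use pymzML and is not its implementation: the XML snippet, the function name extract_accessions and the rule of storing a single field as a bare value are assumptions made for illustration. The term name is also read from the cvParam itself here, whereas the text above says pymzML takes it from the OBO file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down spectrum element (no XML namespace).
SPECTRUM_XML = """
<spectrum id="scan=1" index="0">
  <cvParam accession="MS:1000285" name="total ion current" value="1.5e7"/>
  <cvParam accession="MS:1000016" name="scan start time" value="0.25"
           unitName="minute"/>
</spectrum>
"""


def extract_accessions(xml_text, extra_accessions):
    """Collect the requested attributes of matching cvParam elements.

    Each result is stored under both the accession and the term name,
    mirroring the two-key behaviour described for extraAccessions.
    """
    wanted = dict(extra_accessions)
    result = {}
    for param in ET.fromstring(xml_text).iter("cvParam"):
        fields = wanted.get(param.get("accession"))
        if fields is None:
            continue  # accession was not requested
        values = [param.get(field) for field in fields]
        # Assumption of this sketch: one field -> bare value, several -> list.
        stored = values[0] if len(values) == 1 else values
        result[param.get("accession")] = stored
        result[param.get("name")] = stored
    return result


extra = [("MS:1000285", ["value"]), ("MS:1000016", ["value", "unitName"])]
spectrum = extract_accessions(SPECTRUM_XML, extra)
print(spectrum["total ion current"])  # 1.5e7
print(spectrum["MS:1000016"])         # ['0.25', 'minute']
```

Values come back as strings in this sketch; converting them to numbers is left out to keep it short.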


>>> run = pymzml.run.Reader("../mzML_example_files/100729_t300_100729172744.mzML.gz",
...                         MS1_Precision=20e-6)
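A precision such as 20e-6 is most naturally read as a relative tolerance, i.e. 20 ppm of the measured m/z. The following sketch shows what that reading implies for an m/z window. It is an interpretation, not pymzML code, and the helper names mz_window and within_precision are hypothetical.

```python
def mz_window(mz, precision):
    """Return the (low, high) m/z bounds for a relative precision."""
    delta = mz * precision
    return mz - delta, mz + delta


def within_precision(mz_a, mz_b, precision):
    """True if two m/z values differ by no more than the relative precision."""
    return abs(mz_a - mz_b) <= max(mz_a, mz_b) * precision


low, high = mz_window(1000.0, 20e-6)
print(f"{low:.3f} .. {high:.3f}")  # 999.980 .. 1000.020
print(within_precision(1000.0, 1000.01, 20e-6))  # True
print(within_precision(1000.0, 1000.05, 20e-6))  # False
```

Under this reading the absolute window grows with m/z, so the same setting is tighter in absolute terms for small ions than for large ones.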

Iterating over a Reader instance (Python 2.6+) returns an instance of spec.Spectrum at each step; the same object is also stored in run.spectrum.


>>> for spectrum in run:
...     print(spectrum['id'], end='\r')
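The iterator behaviour used in the loop above can be sketched with a small stand-in class. TinyRun is hypothetical and only mimics the pattern described here (an object that is its own iterator and keeps the most recent item in an attribute named spectrum). It does not parse mzML.

```python
class TinyRun:
    """Hypothetical stand-in for a reader that is its own iterator."""

    def __init__(self, spectra):
        self._spectra = list(spectra)
        self._position = 0
        self.spectrum = None  # most recently returned item

    def __iter__(self):
        # The object itself is the iterator, so a for-loop works directly.
        return self

    def __next__(self):
        if self._position >= len(self._spectra):
            raise StopIteration
        self.spectrum = self._spectra[self._position]
        self._position += 1
        return self.spectrum


tiny_run = TinyRun([{"id": 1}, {"id": 2}, {"id": 3}])
ids = [spectrum["id"] for spectrum in tiny_run]
print(ids)                # [1, 2, 3]
print(tiny_run.spectrum)  # {'id': 3}
```

Because such an object is consumed as it is iterated, a second loop over the same instance yields nothing in this sketch.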

Random access to spectra is available if the mzML file is indexed, not compressed and not truncated.


>>> spectrum_with_nativeID_100 = run[100]
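The three conditions follow from how an index works: an indexed mzML file records the byte offset of each spectrum near the end of the file, and a reader jumps straight to that offset. If the end of the file is cut off, the index is lost. If the file is compressed, offsets into the uncompressed data no longer correspond to positions in the stored file. The sketch below shows the idea on an in-memory byte stream. The record layout and the helper names are assumptions made for illustration, not the mzML index format or pymzML code.

```python
import io

# Hypothetical records keyed by native ID.
records = {
    1: b"<spectrum id='1'/>\n",
    2: b"<spectrum id='2'/>\n",
    3: b"<spectrum id='3'/>\n",
}


def build_stream_with_index(records):
    """Write all records to a byte stream and remember where each starts."""
    stream = io.BytesIO()
    offsets = {}
    for native_id, payload in records.items():
        offsets[native_id] = (stream.tell(), len(payload))
        stream.write(payload)
    stream.seek(0)
    return stream, offsets


def read_record(stream, offsets, native_id):
    """Jump to a record by offset instead of scanning from the start."""
    start, length = offsets[native_id]
    stream.seek(start)
    return stream.read(length)


stream, offsets = build_stream_with_index(records)
print(read_record(stream, offsets, 3))  # b"<spectrum id='3'/>\n"
print(read_record(stream, offsets, 1))  # b"<spectrum id='1'/>\n"
```

Records can be fetched in any order because each lookup is a single seek followed by a read.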

class run.Writer
__init__(filename*, run*[, overwrite = boolean])

Initializes an mzML writer (beta stage).

  • filename (string) – filename for the new mzML file.
  • run (pymzml.run.Reader) – a pymzml.run.Reader object is currently required because the writer does not yet generate the header itself.
  • overwrite (boolean) – force re-initialization of the mzML file, even if the file already exists.


>>> run = pymzml.run.Reader(
...     '../mzML_example_files/100729_t300_100729172744.mzML',
...     MS1_Precision=5e-6,
... )
>>> run2 = pymzml.run.Writer(filename='write_test.mzML', run=run , overwrite=True)
>>> spec = run[1000]
>>> run2.addSpec(spec)
>>> run2.save()