{
    "info": {
        "author": "Dr. Jan-Philip Gehrcke",
        "author_email": "jgehrcke@googlemail.com",
        "bugtrack_url": null,
        "classifiers": [
            "License :: OSI Approved :: MIT License",
            "Operating System :: POSIX",
            "Programming Language :: Python",
            "Programming Language :: Python :: 3.6",
            "Programming Language :: Python :: 3.7",
            "Programming Language :: Python :: Implementation :: CPython"
        ],
        "description": "# Overview\n\nMeasures the resource utilization of a specific process over time.\n\nAlso measures the utilization/saturation of system-wide resources: this helps putting the process-specific metrics into context.\n\nBuilt for Linux. Windows and Mac OS support might come.\n\nFor a list of the currently supported metrics see [below](#measurands).\n\nThe name, [G\u00f6ffel](https://de.wikipedia.org/wiki/Essbesteck#Mischformen), is German for [spork](https://en.wikipedia.org/wiki/Spork):\n\n![image of a spork](docs/figs/spork.jpg?raw=true \"image of spork / G\u00f6ffel\")\n\nConvenient, right?\n\n## Highlights\n\n- High sampling rate: the default sampling interval of `0.5 s` makes narrow spikes visible.\n- Can monitor a program subject to process ID changes (for longevity experiments where the monitored process occasionally restarts, for instance as of fail-over scenarios).\n- Can run indefinitely. Has predictable disk space requirements (output file rotation and retention policy).\n- Keeps your data organized: the time series data is written into a structured [HDF5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) file annotated with relevant metadata (also including program invocation time, system hostname, a custom label, the Goeffel software version, and others).\n- Interoperability: output files can be read with any HDF5 reader such as [PyTables](https://www.pytables.org) and especially with [pandas.read_hdf()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_hdf.html). 
See [tips and tricks](#tips-and-tricks).\n- Values measurement correctness very highly (see [technical notes](#technical-notes)).\n- Comes with a data plotting tool separate from the data acquisition program.\n\n# Download & installation\n\nThe latest Goeffel release can be downloaded and installed from PyPI, via pip:\n\n```\n$ pip install goeffel\n```\n\npip can also install the latest development version of Goeffel:\n\n```\n$ pip install git+https://github.com/jgehrcke/goeffel\n```\n\n# CLI tutorial\n\n## `goeffel`: data acquisition\n\nInvoke Goeffel with the `--pid <pid>` argument if the process ID of the target process is known.\nIn this mode, `goeffel` stops the measurement and terminates itself once the process with the given ID goes away. Example:\n\n```text\n$ goeffel --pid 29019\n\n[... snip ...]\n\n190809-15:46:57.914 INFO: Updated HDF5 file: wrote 20 sample(s) in 0.01805 s\n\n[... snip ...]\n\n190809-15:56:13.842 INFO: Cannot inspect process: process no longer exists (pid=29019)\n190809-15:56:13.843 INFO: Wait for producer buffer to become empty\n190809-15:56:13.843 INFO: Wait for consumer process to terminate\n190809-15:56:13.854 INFO: Updated HDF5 file: wrote 13 sample(s) in 0.01077 s\n190809-15:56:13.856 INFO: Sample consumer process terminated\n```\n\n\nFor measuring beyond the process lifetime use `--pid-command <command>`.\nIn the following example, I use the [pgrep](https://linux.die.net/man/1/pgrep) utility for discovering the newest [stress](https://linux.die.net/man/1/stress) process:\n\n```text\n$ goeffel --pid-command 'pgrep stress --newest'\n\n[... snip ...]\n\n190809-15:47:47.337 INFO: New process ID from PID command: 25890\n\n[... snip ...]\n\n190809-15:47:57.863 INFO: Updated HDF5 file: wrote 20 sample(s) in 0.01805 s\n190809-15:48:06.850 INFO: Cannot inspect process: process no longer exists (pid=25890)\n190809-15:48:06.859 INFO: PID command returned non-zero\n\n[... 
snip ...]\n\n190809-15:48:09.916 INFO: PID command returned non-zero\n190809-15:48:10.926 INFO: New process ID from PID command: 28086\n190809-15:48:12.438 INFO: Updated HDF5 file: wrote 20 sample(s) in 0.01013 s\n190809-15:48:22.446 INFO: Updated HDF5 file: wrote 20 sample(s) in 0.01062 s\n\n[... snip ...]\n```\nIn this mode, `goeffel` runs forever until manually terminated via `SIGINT` or `SIGTERM`.\nProcess ID changes are detected by periodically running the discovery command until it returns a valid process ID on stdout.\nThis is useful for longevity experiments where the monitored process occasionally restarts, for instance as of fail-over scenarios.\n\n## `goeffel-analysis`: data inspection and visualization\n\n**Note**: `goeffel-analysis` provides an opinionated and limited approach to visualizing data. For advanced and thorough data analysis I recommend building a custom (maybe even ad-hoc) data analysis pipeline using `pandas` and `matplotlib`, or using the tooling of your choice.\n\n**Also note**: The command line interface provided by `goeffel-analysis`,\nespecially for the plot commands, might change in the future. Suggestions for\nimprovement are welcome, of course.\n\n### `goeffel-analysis inspect`:\n\nUse `goeffel-analysis inspect <path-to-HDF5-file>` for inspecting the contents\nof a Goeffel output file. Example:\n\n```text\n$ goeffel-analysis inspect mwst18-master1-journal_20190801_111952.hdf5\nMeasurement metadata:\n  System hostname: int-master1-mwt18.foo.bar\n  Invocation time (local): 20190801_111952\n  PID command: pgrep systemd-journal\n  PID: None\n  Sampling interval: 1.0 s\n\nTable properties:\n  Number of rows: 24981\n  Number of columns: 38\n  Number of data points (rows*columns): 9.49E+05\n  First row's (local) time: 2019-08-01T11:19:53.613377\n  Last  row's (local) time: 2019-08-01T18:52:49.954582\n  Time span: 7h 32m 56s\n\nColumn names:\n  unixtime\n  ... 
snip ...\n  system_mem_inactive\n```\n\n### `goeffel-analysis plot`: quickly plot data from a single time series file\n\nThe `goeffel-analysis plot <path-to-hdf5-file>` command plots a pre-selected\nset of metrics in an opinionated way. More metrics can be added to the plot with\nthe `--metric <metric-name>` option. Example command:\n\n```bash\ngoeffel-analysis plot \\\n  mwst18-master2-mesosmaster_20190801_112136.hdf5 \\\n  --metric proc_num_ip_sockets_open\n```\nExample output figure:\n![goeffel-analysis plot example output image](https://raw.githubusercontent.com/jgehrcke/goeffel/0.2.0/docs/figs/analysis_magic_example_small.png \"goeffel-analysis plot example\")\n\n### `goeffel-analysis flexplot`: generic plot command\n\nThis command can be used for example for comparing multiple time series.\nSay you have monitored the same program across multiple replicas in a distributed system and would like to compare the time evolution of a certain metric across these replicas.\nThen the `goeffel-analysis flexplot` command is here to help, invoked with multiple `--series` arguments:\n\n```bash\n$ goeffel-analysis flexplot \\\n  --series mwst18-master1-journal_20190801_111952.hdf5 master1 \\\n  --series mwst18-master2-journal_20190801_112136.hdf5 master2 \\\n  --series mwst18-master3-journal_20190801_112141.hdf5 master3 \\\n  --series mwst18-master4-journal_20190801_112151.hdf5 master4 \\\n  --series mwst18-master5-journal_20190801_112157.hdf5 master5 \\\n  --column proc_cpu_util_percent_total \\\n      'CPU util (total) / %' \\\n      'systemd journal CPU utilization ' 15 \\\n  --subtitle 'MWST18, measured with Goeffel' \\\n  --legend-loc 'upper center'\n\n```\n\nExample output figure:\n![goeffel-analysis flexplot example output image](https://raw.githubusercontent.com/jgehrcke/goeffel/0.2.0/docs/figs/analysis_plot_example_small.png \"goeffel-analysis flexplot example\")\n\n\n# Background and details\n\n## Prior art\n\nThis was born out of a need for solid tooling. 
We started with [pidstat from\nsysstat](https://github.com/sysstat/sysstat/blob/master/pidstat.c), launched as\n`pidstat -hud -p $PID 1 1`. We found that it does not properly account for\nmultiple threads running in the same process and that various issues in that\nregard exist in this program across various versions (see\n[here](https://github.com/sysstat/sysstat/issues/73#issuecomment-349946051),\n[here](https://github.com/sysstat/sysstat/commit/52977c479), and\n[here](https://github.com/sysstat/sysstat/commit/a63e87996)).\n\nThe program [cpustat](https://github.com/uber-common/cpustat) open-sourced by\nUber has a delightful README about the general measurement methodology and\noverall seems to be a great tool. However, it seems to be optimized for\ninteractive usage (whereas we were looking for a robust measurement program\nwhich can be pointed at a process and then be left unattended for a significant\nwhile) and there does not seem to be a well-documented approach towards\npersisting the collected time series data on disk for later inspection.\n\nThe program [psrecord](https://github.com/astrofrog/psrecord) (which effectively\nwraps [psutil](https://psutil.readthedocs.io/en/latest/)) has a similar\nfundamental approach as Goeffel; it however only measures few metrics, and\nit does not have a clear separation of concerns between persisting the data to\ndisk, performing the measurement itself, and analyzing/plotting the data.\n\n## Technical notes\n\n- The core sampling loop does little work besides the measurement itself: it\n  writes each sample to a queue. A separate process consumes this queue and\n  persists the time series data to disk, for later inspection. This keeps the\n  sampling rate predictable upon disk write latency spikes, or generally upon\n  backpressure. 
This matters especially in cloud environments where we sometimes\n  see fsync latencies of multiple seconds.\n\n- The sampling loop is (supposed to be, feedback welcome) built so that\n  timing-related systematic measurement errors are minimized.\n\n- Goeffel tries to not asymmetrically hide measurement uncertainty. For example,\n  you might see it measure a CPU utilization of a single-threaded process\n  slightly larger than 100 %. That's simply the measurement error. In related\n  tooling such as `sysstat` it seems to be common practice to _asymmetrically_\n  hide measurement uncertainty by capping values when they are known to in\n  theory not exceed a certain threshold\n  ([example](https://github.com/sysstat/sysstat/commit/52977c479d3de1cb2535f896273d518326c26722)).\n\n- `goeffel` must be run with `root` privileges.\n\n- The value `-1` has a special meaning for some metrics\n  ([NaN](https://en.wikipedia.org/wiki/NaN), which cannot be represented\n  properly in HDF5). Example: A disk write latency of `-1 ms` means that no\n  write happened in the corresponding time interval.\n\n- The highest meaningful sampling rate is limited by the kernel's timer and\n  bookkeeping system.\n\n# Measurands\n\n[Measurand](https://en.wiktionary.org/wiki/measurand) is a word! This section\nattempts to describe the individual data columns (\"metrics\"), their units, and\ntheir meaning. There are four main categories:\n\n- [Timestamps](#timestamps)\n- [Process-specific metrics](#process-specific-metrics)\n- [Disk metrics](#disk-metrics)\n- [System-wide metrics](#system-wide-metrics)\n\n\n### Timestamps\n\n#### `unixtime`, `isotime_local`, `monotime`\n\nThe timestamp corresponding to the *right* boundary of the sampled time\ninterval.\n\n* `unixtime` encodes the wall time. It is a canonical Unix timestamp (seconds\n  since epoch, double-precision floating point number); with sub-second\n  precision and no timezone information. 
This is compatible with a wide range of\n  tooling and therefore the general-purpose timestamp column for time series\n  analysis (also see [How to convert the `unixtime` column into a\n  `pandas.DatetimeIndex`](#how-to-convert-the-unixtime-column-into-a-pandasdatetimeindex)).\n  **Note**: this is subject to system clock drift. In extreme cases, this might\n  go backward, have discontinuities, and be a useless metric. In that case, the `monotime`\n  metric helps (see below).\n\n* `isotime_local` is a human-readable version of the same timestamp as stored in\n  `unixtime`. It is a 26 character long text representation of the *local* time\n  using an ISO 8601 notation (and therefore also machine-readable). Like\n  `unixtime` this metric is subject to system clock drift and might become\n  pretty useless in extreme cases.\n\n* `monotime` is based on a so-called\n  [monotonic](https://www.python.org/dev/peps/pep-0418/#id19) clock source that\n  is *not* subject to (accidental or well-intended) system clock drift. This\n  column encodes most accurately the relative time difference between any two\n  samples in the time series. The timestamps encoded in this column only make\n  sense relative to each other; the difference between any two values in this\n  column is a *wall time* difference in seconds, with sub-second precision.\n\n### Process-specific metrics\n\n#### `proc_pid`\n\nThe process ID of the monitored process. It can change if Goeffel was invoked\nwith the `--pid-command` option.\n\nMomentary state at sampling time.\n\n#### `proc_cpu_util_percent_total`\n\nThe CPU utilization of the process in `percent`.\n\nMean over the past sampling interval.\n\nIf the inspected process is known to contain just a single thread then this can\nstill sometimes be larger than 100 % because of measurement error. 
If the process\nruns more than one thread then this can go far beyond 100 %.\n\nThis is based on the sum of the time spent in user space and in kernel space.\nFor a more fine-grained picture the following two metrics are also available:\n`proc_cpu_util_percent_user` and `proc_cpu_util_percent_system`.\n\n#### `proc_cpu_id`\n\nThe ID of the CPU that this process is currently running on.\n\nMomentary state at sampling time.\n\n#### `proc_ctx_switch_rate_hz`\n\nThe rate of ([voluntary and\ninvoluntary](https://unix.stackexchange.com/a/442991)) context switches in `Hz`.\n\nMean over the past sampling interval.\n\n#### `proc_num_threads`\n\nThe number of threads in the process.\n\nMomentary state at sampling time.\n\n#### `proc_num_ip_sockets_open`\n\nThe number of IP sockets currently open. This includes IPv4 and IPv6 and does\nnot distinguish between TCP and UDP, and the connection state also does not\nmatter.\n\nMomentary state at sampling time.\n\n#### `proc_num_fds`\n\nThe number of file descriptors currently opened by this process.\n\nMomentary state at sampling time.\n\n#### `proc_disk_read_throughput_mibps` and `proc_disk_write_throughput_mibps`\n\nThe disk I/O throughput of the inspected process, in `MiB/s`.\n\nBased on Linux' `/proc/<pid>/io` `rchar` and `wchar`. Relevant\n[Linux kernel documentation](https://github.com/torvalds/linux/blob/33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9/Documentation/filesystems/proc.txt#L1609) (emphasis mine):\n\n> `rchar`: The number of bytes which this task has caused to be read from\n> storage. This is simply the sum of bytes which this process passed to read()\n> and pread(). *It includes things like tty IO* and it is unaffected by whether\n> or not actual physical disk IO was required (*the read might have been\n> satisfied from pagecache*).\n\n> `wchar`: The number of bytes which this task has caused, or shall cause to be\n> written to disk. 
Similar caveats apply here as with rchar.\n\nMean over the past sampling interval.\n\n#### `proc_disk_read_rate_hz` and `proc_disk_write_rate_hz`\n\nThe rate of read/write system calls issued by the process as inferred from the\nLinux `/proc` file system. The relevant `syscr`/`syscw` counters are as of now\nonly documented with \"_read I/O operations, i.e. syscalls like read() and\npread()_\" and \"_write I/O operations, i.e. syscalls like write() and pwrite()_\".\nReference:\n[Documentation/filesystems/proc.txt](https://github.com/torvalds/linux/blob/33920f1ec5bf47c5c0a1d2113989bdd9dfb3fae9/Documentation/filesystems/proc.txt#L1628)\n\nMean over the past sampling interval.\n\n#### `proc_mem_rss_percent`\n\nFraction of process [resident set size](https://stackoverflow.com/a/21049737)\n(RSS) relative to the machine's physical memory size in `percent`. This is\nequivalent to what `top` shows in the `%MEM` column.\n\nMomentary state at sampling time.\n\n#### `proc_mem_rss`, `proc_mem_vms`, `proc_mem_dirty`\n\nVarious memory usage metrics of the monitored process. See the [psutil\ndocs](https://psutil.readthedocs.io/en/release-5.3.0/#psutil.Process.memory_info)\nfor a quick summary of what the values mean. However, note that the values need\ncareful interpretation, as shown by discussions like\n[this](https://serverfault.com/q/138427) and\n[this](https://serverfault.com/q/138427).\n\nMomentary state at sampling time.\n\n\n### Disk metrics\n\nOnly collected if Goeffel is invoked with the `--diskstats <DEV>` argument. The\nresulting data column names contain the device name `<DEV>` (note however that\ndashes in `<DEV>` get removed when building the column names).\n\nNote that the conclusiveness of some of these disk metrics is limited. 
I believe\nthat [this blog post](https://blog.serverfault.com/2010/07/06/777852755/) nicely\ncovers a few basic Linux disk I/O concepts that should be known prior to reading\nmeaning into these numbers.\n\n#### `disk_<DEV>_util_percent`\n\nThis implements iostat's disk `%util` metric.\n\nI like to think of it as the ratio between the device's \"busy time\" in the\nsampled time interval and the actual (wall) time elapsed in the very same time\ninterval, expressed in percent. The iostat documentation describes\nthis metric in the following words:\n\n> Percentage of elapsed time during which I/O requests were issued to the device\n> (bandwidth utilization for the device).\n\nThis is the mean over the sampling interval.\n\n**Note**: In the case of modern storage systems `100 %` utilization usually does\n**not** mean that the device is saturated. I would like to quote [Marc\nBrooker](https://brooker.co.za/blog/2014/07/04/iostat-pct.html):\n\n> As a measure of general IO busyness `%util` is fairly handy, but as an\n> indication of how much the system is doing compared to what it can do, it's\n> terrible.\n\n#### `disk_<DEV>_write_latency_ms` and `disk_<DEV>_read_latency_ms`\n\nThis implements iostat's `w_await` which is documented with\n\n> The average time (in milliseconds) for write requests issued to the device to\n> be served. This includes the time spent by the requests in queue and the time\n> spent servicing them.\n\nOn Linux, this is built using `/proc/diskstats` documented\n[here](https://github.com/torvalds/linux/blob/v5.3-rc3/Documentation/admin-guide/iostats.rst#io-statistics-fields).\nSpecifically, this uses field 8 (\"number of milliseconds spent writing\") and\nfield 5 (\"number of writes completed\"). 
Notably, the latter is *not* the\nmerged write count but the user space write count (which seems to be what iostat\nuses for calculating `w_await`).\n\nThis can be a useful metric, but please be aware of its meaning and limitations.\nTo put this into perspective, in an experiment I have seen that the following\ncan happen within a second of real time (observed via `iostat -x 1 | grep xvdh`\nand via direct monitoring of `/proc/diskstats`): 3093 userspace write requests\nserved, merged into 22 device write requests, yielding a total of 120914\nmilliseconds \"spent writing\", resulting in a mean write latency of 25 ms. But\nwhat do these 25 ms really mean here? On average, humans have fewer than two\nlegs, for sure. The current implementation method reproduces iostat output,\nwhich was the initial goal. Suggestions for improvement are very welcome.\n\nThis is the mean over the sampling interval.\n\nThe same considerations hold true for `r_await`, correspondingly.\n\n#### `disk_<DEV>_merged_read_rate_hz` and `disk_<DEV>_merged_write_rate_hz`\n\nThe _merged_ read and write request rate.\n\nThe Linux kernel attempts to merge individual user space requests before passing\nthem to the storage hardware. 
For non-random I/O patterns this greatly reduces\nthe rate of individual reads and writes issued to disk.\n\nBuilt using fields 2 and 6 in `/proc/diskstats` documented\n[here](https://github.com/torvalds/linux/blob/v5.3-rc3/Documentation/admin-guide/iostats.rst#io-statistics-fields).\n\nThis is the mean over the sampling interval.\n\n#### `disk_<DEV>_userspace_read_rate_hz` and `disk_<DEV>_userspace_write_rate_hz`\n\nThe read and write request rate issued from user space point of view (before\nmerges).\n\nBuilt using fields 1 and 5 in `/proc/diskstats` documented\n[here](https://github.com/torvalds/linux/blob/v5.3-rc3/Documentation/admin-guide/iostats.rst#io-statistics-fields).\n\nThis is the mean over the sampling interval.\n\n\n### System-wide metrics\n\n`system_loadavg1`\n\n`system_loadavg5`\n\n`system_loadavg15`\n\n`system_mem_available`\n\n`system_mem_total`\n\n`system_mem_used`\n\n`system_mem_free`\n\n`system_mem_shared`\n\n`system_mem_buffers`\n\n`system_mem_cached`\n\n`system_mem_active`\n\n`system_mem_inactive`\n\n\n# Tips and tricks\n\n## How to convert a Goeffel HDF5 file into a CSV file\n\nI recommend to de-serialize and re-serialize using\n[pandas](https://pandas.pydata.org/). Example one-liner:\n```\npython -c 'import sys; import pandas as pd; df = pd.read_hdf(sys.argv[1], key=\"goeffel_timeseries\"); df.to_csv(sys.argv[2], index=False)' goeffel_20190718_213115.hdf5.0001 /tmp/hdf5-as-csv.csv\n```\nNote that this significantly inflates the file size (e.g., from 50 MiB to 300\nMiB).\n\n## How to visualize and browse the contents of an HDF5 file\n\nAt some point, you might feel inclined to poke around in an HDF5 file created by\nGoeffel or to do custom data inspection/processing. In that case, I recommend\nusing one of the various available open-source HDF5 tools for managing and\nviewing HDF5 files. One GUI tool I have frequently used is\n[ViTables](http://vitables.org/). 
Install it with `pip install vitables` and\nthen do e.g.\n\n```text\nvitables goeffel_20190718_213115.hdf5\n```\n\nThis opens a GUI which allows for browsing the tabular time series data, for\nviewing the metadata in the file, for exporting data as CSV, for querying the\ndata, and various other things.\n\n## How to do quick data analysis using IPython and pandas\n\nI recommend to start an [IPython](https://ipython.org/) REPL:\n```text\npip install ipython  # if you have not done so yet\nipython\n```\nLoad the HDF5 file into a `pandas` data frame:\n```\nIn [1]: import pandas as pd\nIn [2]: df = pd.read_hdf('goeffel_timeseries__20190806_213704.hdf5', key='goeffel_timeseries')\n```\nFrom here you can do anything.\n\nFor example, let's have a look at the mean value of the actual sampling interval\nused in this specific Goeffel time series:\n```\nIn [3]: df['unixtime'].diff().mean()\nOut[3]: 0.5003192476604296\n```\n\nOr, let's see how many threads the monitored process used at most during the\nentire observation period:\n```\nIn [4]: df['proc_num_threads'].max()\nOut[4]: 1\n```\n\n## How to convert the `unixtime` column into a `pandas.DatetimeIndex`\n\nThe HDF5 file contains a `unixtime` column which contains canonical Unix\ntimestamp data ready to be consumed by a plethora of tools. 
If you are like me\nand like to use `pandas` then it is good to know how to convert this into a\nnative `pandas.DatetimeIndex`:\n\n```\nIn [1]: import pandas as pd\nIn [2]: df = pd.read_hdf('goeffel_timeseries__20190807_174333.hdf5', key='goeffel_timeseries')\n\n# Now the data frame has an integer index.\nIn [3]: type(df.index)\nOut[3]: pandas.core.indexes.numeric.Int64Index\n\n# Parse unixtime column.\nIn [4]: timestamps = pd.to_datetime(df['unixtime'], unit='s')\n\n# Replace the index of the data frame.\nIn [5]: df.index = timestamps\n\n# Now the data frame has a DatetimeIndex.\nIn [6]: type(df.index)\nOut[6]: pandas.core.indexes.datetimes.DatetimeIndex\n\n# Let's look at some values.\nIn [7]: df.index[:5]\nOut[7]:\nDatetimeIndex(['2019-08-07 15:43:33.798929930',\n               '2019-08-07 15:43:34.300590992',\n               '2019-08-07 15:43:34.801260948',\n               '2019-08-07 15:43:35.301798105',\n               '2019-08-07 15:43:35.802226067'],\n              dtype='datetime64[ns]', name='unixtime', freq=None)\n```\n\n\n# Valuable references\n\nExternal references on the subject matter that I found useful during\ndevelopment.\n\nAbout system performance measurement, and kernel time bookkeeping:\n\n- http://www.brendangregg.com/usemethod.html\n- https://www.vividcortex.com/blog/monitoring-and-observability-with-use-and-red\n- https://github.com/uber-common/cpustat/blob/master/README.md\n- https://elinux.org/Kernel_Timer_Systems\n- https://github.com/Leo-G/DevopsWiki/wiki/How-Linux-CPU-Usage-Time-and-Percentage-is-calculated\n\nAbout disk I/O statistics:\n\n- https://www.xaprb.com/blog/2010/01/09/how-linux-iostat-computes-its-results/\n- https://www.kernel.org/doc/Documentation/iostats.txt\n- https://blog.serverfault.com/2010/07/06/777852755/ (interpreting iostat output)\n- https://unix.stackexchange.com/a/462732 (What are merged writes?)\n- https://stackoverflow.com/a/8512978 (what is `%util` in iostat?)\n- 
https://brooker.co.za/blog/2014/07/04/iostat-pct.html\n- https://coderwall.com/p/utc42q/understanding-iostat\n- https://www.percona.com/doc/percona-toolkit/LATEST/pt-diskstats.html\n\nOthers:\n\n- https://serverfault.com/a/85481/121951 (about system memory statistics)\n\nMusings about HDF5:\n\n- https://cyrille.rossant.net/moving-away-hdf5/\n- http://hdf-forum.184993.n3.nabble.com/File-corruption-and-hdf5-design-considerations-td4025305.html\n- https://pytables-users.narkive.com/QH2WlyqN/corrupt-hdf5-files\n- https://www.hdfgroup.org/2015/05/whats-coming-in-the-hdf5-1-10-0-release/\n- https://stackoverflow.com/q/35837243/145400",
        "description_content_type": "text/markdown",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/jgehrcke/goeffel",
        "keywords": "",
        "license": "",
        "maintainer": "",
        "maintainer_email": "",
        "name": "goeffel",
        "package_url": "https://pypi.org/project/goeffel/",
        "platform": "",
        "project_url": "https://pypi.org/project/goeffel/",
        "project_urls": {
            "Homepage": "https://github.com/jgehrcke/goeffel"
        },
        "release_url": "https://pypi.org/project/goeffel/0.3.0/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Measures the resource utilization of a specific process over time",
        "version": "0.3.0"
    },
    "last_serial": 5861194,
    "releases": {
        "0.2.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "7bb8c9e5c31d0e3bea9e3b5c8022c733",
                    "sha256": "1e4f498081bbd9b0b5284b755d5457ef6bda4f036355720e0b7f6b69ce48feb0"
                },
                "downloads": -1,
                "filename": "goeffel-0.2.0.tar.gz",
                "has_sig": false,
                "md5_digest": "7bb8c9e5c31d0e3bea9e3b5c8022c733",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 41438,
                "upload_time": "2019-08-13T18:26:27",
                "url": "https://files.pythonhosted.org/packages/c6/af/bb7e3af2e2e79673679cedef80380163524a8c688bddf0fbbc3837e35297/goeffel-0.2.0.tar.gz"
            }
        ],
        "0.3.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "7154b3346ba1bdc457de07919bdc9522",
                    "sha256": "a9fe2e0f842d397d54da6056e8b1f26a5cbd5c114e2f3928871088d64a870ac1"
                },
                "downloads": -1,
                "filename": "goeffel-0.3.0.tar.gz",
                "has_sig": false,
                "md5_digest": "7154b3346ba1bdc457de07919bdc9522",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 49837,
                "upload_time": "2019-09-20T10:20:57",
                "url": "https://files.pythonhosted.org/packages/b0/7f/5569b49c28502983740ca3b93a6b4ca6b19e99adc79b955c43b249970caa/goeffel-0.3.0.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "7154b3346ba1bdc457de07919bdc9522",
                "sha256": "a9fe2e0f842d397d54da6056e8b1f26a5cbd5c114e2f3928871088d64a870ac1"
            },
            "downloads": -1,
            "filename": "goeffel-0.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "7154b3346ba1bdc457de07919bdc9522",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 49837,
            "upload_time": "2019-09-20T10:20:57",
            "url": "https://files.pythonhosted.org/packages/b0/7f/5569b49c28502983740ca3b93a6b4ca6b19e99adc79b955c43b249970caa/goeffel-0.3.0.tar.gz"
        }
    ]
}