Defining a dataset

To start a new PKanalix project, you need to define a dataset by loading a file in the Data tab.

Supported file types: .txt, .csv, and .tsv files. Starting with version 2024, Excel and SAS file types are also supported: .xls, .xlsx, .sas7bdat, and .xpt.
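If you prepare datasets with a script before loading them, a quick extension check can catch unsupported files early. The following is a minimal sketch, not part of PKanalix: the function name `is_supported` and the case-insensitive comparison are assumptions made for illustration.

```python
from pathlib import Path

# Extensions listed in the documentation above.
BASE_EXTENSIONS = {".txt", ".csv", ".tsv"}
EXTRA_EXTENSIONS_2024 = {".xls", ".xlsx", ".sas7bdat", ".xpt"}


def is_supported(filename, version_2024_or_later=True):
    """Return True if the file extension is in the supported list.

    Hypothetical helper: comparing extensions case-insensitively is an
    assumption, not documented PKanalix behavior.
    """
    ext = Path(filename).suffix.lower()
    allowed = set(BASE_EXTENSIONS)
    if version_2024_or_later:
        allowed |= EXTRA_EXTENSIONS_2024
    return ext in allowed
```

For example, `is_supported("study.xlsx", version_2024_or_later=False)` returns `False`, because Excel files are only listed from version 2024 onward.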

The dataset format expected in the Data tab is the same as for the entire MonolixSuite, to allow smooth transitions between applications. The columns available in this format and example datasets are detailed on this page. Briefly:

  • Each line corresponds to one individual and one time point
  • Each line can include a single measurement (also called observation), or a dose amount (or both a measurement and a dose amount)
  • Dosing information should be indicated for each individual, even if it is identical for all.
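The three rules above can be illustrated with a small script that writes a dataset in this long format. This is only a sketch: the column names (ID, TIME, AMT, Y), the "." marker for an empty cell, and all values are made up for illustration and are not taken from a real study.

```python
import csv
import io

# One line per individual and per time point. Each line carries a dose
# amount (AMT), a measurement (Y), or both. All values are invented.
rows = [
    {"ID": 1, "TIME": 0, "AMT": 100, "Y": "."},
    {"ID": 1, "TIME": 1, "AMT": ".", "Y": 4.2},
    {"ID": 1, "TIME": 4, "AMT": ".", "Y": 2.7},
    {"ID": 2, "TIME": 0, "AMT": 100, "Y": "."},  # dose repeated for each individual
    {"ID": 2, "TIME": 1, "AMT": ".", "Y": 3.9},
]


def to_csv(records):
    """Serialize the records as comma-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["ID", "TIME", "AMT", "Y"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def ids_without_dose(records):
    """Return the individuals that have no dose line at all."""
    dosed = {r["ID"] for r in records if r["AMT"] != "."}
    return sorted({r["ID"] for r in records} - dosed)


dataset_csv = to_csv(rows)
```

A check such as `ids_without_dose(rows)` is a convenient way to verify, before loading, that dosing information is present for every individual.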

Your dataset may not be originally in this format, and you may want to add information on dose amounts, limits of quantification, units, or filter part of the dataset. To do so, you should proceed in this order:

  • 1) Formatting: If needed, format your data first by loading the dataset in the Data Formatting tab. Briefly, it allows you to:
    • deal with several header lines
    • merge several observation columns into one
    • add censoring information based on tags in the observation column
    • add treatment information manually or from external sources
    • add more columns based on another file

or

  • 1) If the data is already in the right format, load it directly in the Data tab (otherwise, load the formatted dataset created by Data Formatting). If the dataset breaks a formatting rule, it will not be accepted, but the error messages will help you find what is missing and could be added with Data Formatting.
  • 2) Labeling: label the columns not recognized automatically to indicate their type and click on ACCEPT.
  • 3) Units: If you want to work with units, indicate the units in the data and the ones you prefer to use inside the Data tab (if relevant)
  • 4) Filters: If needed, use the Filters tab to keep only part of your dataset.
  • 5) Explore: The interpreted dataset is displayed in Data, and plots and covariate statistics are generated.
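To make the formatting step "add censoring information based on tags in the observation column" more concrete, here is a minimal sketch of the kind of transformation involved. It is not the Data Formatting implementation: the tag "BLQ", the limit value 0.5, the column names Y and CENS, and the convention of replacing the tag by the limit of quantification are all assumptions chosen for illustration.

```python
def add_censoring(records, tag="BLQ", lloq=0.5):
    """Return copies of the records with a CENS column added.

    Hypothetical convention: a tagged observation is replaced by the
    lower limit of quantification and flagged with CENS = 1; all other
    lines get CENS = 0.
    """
    out = []
    for rec in records:
        new = dict(rec)  # leave the input records untouched
        if str(rec["Y"]).strip() == tag:
            new["Y"] = lloq
            new["CENS"] = 1
        else:
            new["CENS"] = 0
        out.append(new)
    return out
```

In PKanalix itself this is done interactively in the Data Formatting tab; the sketch only shows what the resulting columns could look like.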

To use a dataset from a Monolix project, or to use simulations from a Simulx project, you can directly import or export the Monolix/Simulx project, which automatically defines the dataset in the Data tab.

Labeling

The column type suggested automatically by PKanalix based on the headers in the data can be customized in the preferences. Clicking on Settings>Preferences opens the following window.

In the DATA frame, you can add or remove preferences for each column.
To remove a preference, double-click on the preference you would like to remove. A confirmation window will appear.
To add a preference, click on the relevant header type, enter a name in the header name field, and click on “ADD HEADER” as in the following figure.

Note that all the preferences are shared between Monolix, Datxplore, and PKanalix.

Starting with version 2024, it is also possible to update the preferences with the columns tagged in the opened project, by clicking on the icon in the top left corner of the table:


Clicking on the icon opens a modal window where you can choose which of the tagged headers to add to the preferences:

Dataset load times

Starting with version 2024, it is possible to improve project load times, especially for projects with large datasets, by saving the data as a binary file. This option is available in Settings>Preferences and saves a copy of the data file in binary format in the results folder. When a project is reloaded, the dataset is read from the binary file, which is faster. If the original dataset file has been modified (compared to the binary), a warning message appears, the binary dataset is not used, and the original dataset file is loaded instead.

Select 'Save dataset as binary file' under Options.
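The reload behavior described above follows a common caching pattern, sketched below. This is not how PKanalix is implemented: the use of a SHA-256 digest to detect a modified dataset, the pickle format, and the names `save_binary` and `load_dataset` are assumptions chosen to keep the example self-contained.

```python
import csv
import hashlib
import pickle
import warnings
from pathlib import Path


def file_digest(path):
    """Fingerprint of the original dataset file (assumed detection method)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def read_text_dataset(path):
    """Slow path: parse the original text file."""
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))


def save_binary(data_path, binary_path):
    """Store a binary copy of the dataset together with its fingerprint."""
    payload = {"digest": file_digest(data_path), "rows": read_text_dataset(data_path)}
    with open(binary_path, "wb") as fh:
        pickle.dump(payload, fh)


def load_dataset(data_path, binary_path):
    """Use the binary copy if it still matches the original file.

    Returns the rows and the source that was used ("binary" or "original").
    """
    binary_path = Path(binary_path)
    if binary_path.exists():
        with open(binary_path, "rb") as fh:
            cached = pickle.load(fh)
        if cached["digest"] == file_digest(data_path):
            return cached["rows"], "binary"
        warnings.warn("Dataset changed since the binary copy was saved; "
                      "loading the original file instead.")
    return read_text_dataset(data_path), "original"
```

The key design point is that the binary copy is only trusted while it can be shown to correspond to the current original file; otherwise the loader falls back to the original, as the documentation describes.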

Resulting plots and tables to explore the data

Once the dataset is accepted:

  • Plots are automatically generated based on the interpreted dataset to help you proceed with a first data exploration before running any task.

  • The interpreted dataset appears in the Data tab and incorporates all changes made by formatting, setting units, and filtering.

  • Covariate statistics appear in a section of the Data tab.