Dataloader Factory

The dataloader factory produces dataloader objects from the parameters provided in the config file. The parameters each dataloader needs are defined in the get_dataloader() method of the factory. At a minimum, a name must be provided to select a dataloader from those available.

Adding New Dataloader to Factory

Two actions are required to add a new dataloader to the Factory object. A third is optional, and only needed if the dataloader requires a parameter that should have a new default value. The actions are:

  • Import the dataloader
  • Add an entry to the dataloader_requirements dictionary
  • (Optional) Add a default value for any new parameter the dataloader requires

Example

In this example, a new scalar dataloader, myScalarDataloader, has been created and is located at meshiphi/Dataloaders/Scalar/myScalarDataloader.py.

The only parameter this dataloader requires is the data to read. 'files' is listed as the mandatory parameter because 'file' and 'folder' are both translated into a list of files and stored in params under the key 'files':

# Add new import statement for Factory to read
from meshiphi.Dataloaders.Scalar.myScalarDataloader import myScalarDataloader

...

class DataLoaderFactory:
   ...
   @staticmethod
   def get_dataloader(name, bounds, params, min_dp=5):
      ...
      dataloader_requirements = {
         ...
         # Add new dataloaders: name -> (class, list of mandatory params)
         'myscalar':    (myScalarDataloader, ['files']),
         ...
      }
      ...
   ...
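
The registration pattern can be illustrated with a simplified, self-contained sketch. This is not MeshiPhi's actual implementation: ToyScalarDataloader, ToyFactory, and the error messages are hypothetical stand-ins, and bounds is treated as an opaque value. It only shows how a requirements dictionary can map a config name to a class and its mandatory parameters.

```python
class ToyScalarDataloader:
    """Hypothetical stand-in for a real dataloader class."""

    def __init__(self, bounds, params):
        self.bounds = bounds
        self.params = params


class ToyFactory:
    """Simplified illustration of a name-to-class registry."""

    @staticmethod
    def get_dataloader(name, bounds, params, min_dp=5):
        # Each entry pairs a dataloader class with its mandatory parameter keys
        dataloader_requirements = {
            "myscalar": (ToyScalarDataloader, ["files"]),
        }

        if name not in dataloader_requirements:
            raise ValueError(f"Unknown dataloader name: {name!r}")

        loader_class, required = dataloader_requirements[name]

        # Fail early if the config omitted a mandatory parameter
        missing = [key for key in required if key not in params]
        if missing:
            raise ValueError(f"{name!r} is missing required params: {missing}")

        # Build a new dict so the caller's params are left unmodified
        loader_params = dict(params, min_dp=min_dp)
        return loader_class(bounds, loader_params)


loader = ToyFactory.get_dataloader(
    "myscalar", bounds=None, params={"files": ["a.nc"]}
)
```

Keeping the mandatory keys next to the class means a missing parameter is reported when the factory is called, rather than surfacing later inside the dataloader.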

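To call this dataloader, add an entry to the config.json file used to generate the mesh. The data can be specified as a single file, a folder, or a list of individual files. The listing shows all three alternatives side by side; a real config needs only one of them, and the # annotations must be removed because JSON does not allow comments:

{
      "loader": "myscalar",
      "params": {
         "file": "PATH_TO_DATA_FILE",   # For a single file
         "folder": "PATH_TO_FOLDER",    # For a folder, must have trailing '/'
         "files":[                      # For a list of individual files
            "PATH_TO_FILE_1",
            "PATH_TO_FILE_2",
            ...
         ]
      }
}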
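
Because JSON has no comment syntax, it can be convenient to build an entry in Python and serialise it. The following is a minimal sketch using only the standard library; the path is a placeholder, and only the single-file form is shown.

```python
import json

# One entry, specifying a single data file (placeholder path)
entry = {
    "loader": "myscalar",
    "params": {"file": "PATH_TO_DATA_FILE"},
}

# json.dumps emits valid JSON: separators are included and no comments appear
entry_json = json.dumps(entry, indent=3)
print(entry_json)
```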

Dataloader Factory Object

DataLoaderFactory

Produces initialised DataLoader objects that can be used by the mesh to quickly retrieve values within a boundary.

get_dataloader(name, bounds, params, min_dp=5) staticmethod

Creates appropriate dataloader object based on name

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Name of data source/type. Must be one of the following: 'scalar_csv', 'scalar_grf', 'binary_grf', 'amsr', 'bsose_sic', 'bsose_depth', 'baltic_sic', 'gebco', 'icenet', 'modis', 'thickness', 'density', 'circle', 'square', 'gradient', 'checkerboard', 'vector_csv', 'vector_grf', 'baltic_currents', 'era5_wind', 'northsea_currents', 'oras5_currents', 'sose', 'duacs_currents', 'era5_wave_height', 'era5_wave_direction' | required |
| bounds | Boundary | Boundary object with initial mesh space and time limits | required |
| params | dict | Dictionary of parameters required by each dataloader | required |
| min_dp | int | Minimum datapoints required to get homogeneity condition | 5 |

Returns:

| Type | Description |
| --- | --- |
| Scalar/Vector/LUT DataLoader | DataLoader object of correct type, with required params set |

translate_file_input(params) staticmethod

Allows flexible file specification in params. Translates 'file' or 'folder' into 'files'

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| params | dict | Dictionary of parameters written in config | required |
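
The translation described here can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: the function name translate_file_input_sketch is hypothetical, it returns a copy rather than modifying params in place, and it assumes a folder contributes every regular file directly inside it, in sorted order.

```python
import os


def translate_file_input_sketch(params):
    """Return a copy of params with 'file' or 'folder' replaced by 'files'."""
    translated = dict(params)

    if "files" in translated:
        # Already a list of files: nothing to translate
        return translated

    if "file" in translated:
        # A single file becomes a one-element list
        translated["files"] = [translated.pop("file")]
    elif "folder" in translated:
        folder = translated.pop("folder")
        # Assumption: use every regular file directly inside the folder
        translated["files"] = sorted(
            os.path.join(folder, entry)
            for entry in os.listdir(folder)
            if os.path.isfile(os.path.join(folder, entry))
        )
    else:
        raise ValueError("params must contain 'file', 'folder', or 'files'")

    return translated


single = translate_file_input_sketch({"file": "data.nc"})
# single == {"files": ["data.nc"]}
```

Normalising all three forms to 'files' means each dataloader only has to handle a list of paths, which is why 'files' is the key named in dataloader_requirements.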