
Filelist datasets

Merged André Anjos requested to merge datasets into master Apr 30, 2020

@tiago.pereira, @ydayer: here are the CSV and JSON implementations of filelist-based datasets I had in my package, for your review.

The API is designed for application scenarios in which loading an individual sample is costly (e.g. sample data is stored on disk).
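
To illustrate the idea, here is a minimal sketch of a filelist whose samples are only loaded on first access. It is not the code from this merge request: `LazySample`, `load_filelist`, `fake_loader` and the `path`/`label` column names are hypothetical and chosen only for the example.

```python
import csv
import io
from functools import cached_property


class LazySample:
    """Hypothetical sample whose data is only loaded on first access."""

    def __init__(self, loader, key):
        self._loader = loader  # callable that performs the costly load
        self.key = key

    @cached_property
    def data(self):
        # Runs the loader once; the result is then cached on the instance
        return self._loader()


def load_filelist(csv_file, loader):
    """Builds one LazySample per CSV row without touching sample data."""
    reader = csv.DictReader(csv_file)
    return [
        LazySample(lambda path=row["path"]: loader(path), key=row["path"])
        for row in reader
    ]


# Stand-in for a costly disk read; records which paths were actually loaded
loaded = []


def fake_loader(path):
    loaded.append(path)
    return f"contents of {path}"


# In-memory CSV filelist (assumed layout: one row per sample)
filelist = io.StringIO("path,label\na.png,0\nb.png,1\n")
samples = load_filelist(filelist, fake_loader)
```

Reading the filelist only creates lightweight objects; the loader is called when `samples[i].data` is first accessed, so samples that are never used are never read.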

If the sample data is not stored on disk sample by sample (e.g. the whole dataset fits in a single CSV table), other techniques, such as pandas data frames, would be a better fit.

I hope it is useful.

Source branch: datasets