v1.0.0: Release v1.0.0
- !37 Revert "For some reason, the class information is not passed in the sample wrapper": This reverts merge request !36
- !38 [sge] In dask, some subclassed classes need a config name. Fixes #20
- !40 Add dask-client configurations as resources: Fixes #19. Removes the sge-demanding configuration, as all nodes at Idiap have a fast connection now. Depends on bob.bio.base!201
- !39 [dask][sge] Added the variables `idle_timeout` and `allowed_failures` as part of our `.bobrc` and added better defaults
- !41 Added a GPU queue that defaults to short_gpu
- !43 Allow setting specific attributes of sample: Specify the sample attribute to assign the output of an estimator to, instead of 'data' in SampleWrapper. Specify the attribute of sample to save and load in CheckpointWrapper.
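  As a rough illustration of !43, a minimal sketch assuming the target attribute is chosen through an `output_attribute` keyword that `wrap` forwards to `SampleWrapper` (the keyword name is an assumption, not stated in the MR):

```python
import bob.pipelines as mario
from sklearn.preprocessing import FunctionTransformer

# Hypothetical keyword: the MR only says the attribute receiving the
# estimator's output is configurable instead of being hard-coded to "data".
annotator = mario.wrap(
    ["sample"],
    FunctionTransformer(lambda data: [{"topleft": (0, 0)} for _ in data]),
    output_attribute="annotations",  # results land in sample.annotations
)
```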
- !44 Fix sphinx warnings
- !45 Multiple Changes: * When checkpointing, checkpoint all steps in a pipeline * Better names in dask graph for FunctionTransformer * [xarray] Allow for multi-argument transformers * SampleBatch in public API
- !46 move vstack_features to bob.io.base
- !48 Improvements on CheckpointWrapper: Added the optional argument `hash_fn` to the `CheckpointWrapper` class. Once this is set, a hash code is generated from `sample.key` and used to compose the final path where `sample` will be checkpointed. This is optional and generic enough for our purposes. This hash function can be shipped in the database interface. Closes #25 (a brief sketch follows after the next item)
- !47 Multiple changes: * [DelayedSample] Allow for arbitrary delayed attributes * [SampleBatch] Allow other attributes than data. Fixes #26 #24
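  Regarding the `hash_fn` introduced in !48 above, a minimal sketch, assuming the function receives `sample.key` and returns a path fragment, and that `wrap` forwards `hash_fn` (and a `features_dir`) to `CheckpointWrapper`; both are assumptions:

```python
import hashlib
import bob.pipelines as mario

def hash_fn(key):
    # Bucket checkpoints into 256 sub-directories so a single folder does not
    # accumulate millions of files; the exact scheme is database-specific.
    return hashlib.md5(key.encode()).hexdigest()[:2]

# Assumed wiring: hash_fn is forwarded to CheckpointWrapper by wrap.
pipeline = mario.wrap(
    ["sample", "checkpoint"],
    pipeline,
    features_dir="./features",
    hash_fn=hash_fn,
)
```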
- !49 [DelayedSample] Fix issues when an attribute was set
- !50 [DelayedSample(Set)] make load and delayed_attributes private: This removes the need for a lot of guessing in downstream packages, as they can start removing all keys that start with `_` when access to the sample's attributes is needed.
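  As an illustration of what !50 enables downstream (assuming sample metadata lives in plain instance attributes, as it does for `Sample` and `DelayedSample`):

```python
def public_metadata(sample):
    # Skip private machinery such as _load and _delayed_attributes and keep
    # only the user-facing metadata carried by the sample.
    return {k: v for k, v in vars(sample).items() if not k.startswith("_")}
```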
- !51 [dask][sge] Multiqueue updates: In this merge request I: - Simplified the way multi-queue is set in our scripts - Updated our Dask documentation. Example: setting the `fit` method to run on `q_short_gpu`:

```python
pipeline = mario.wrap(
    ["sample", "checkpoint", "dask"],
    pipeline,
    model_path=model_path,
    fit_tag="q_short_gpu",
)
```

  You have to explicitly set the list of resource tags available:

```python
pipeline.fit_transform(...).compute(
    scheduler=dask_client,
    resources=cluster.get_sge_resources(),
)
```
- !53 Updates: Implemented two updates in this MR - Removed the random behavior on the hash_string function (I had some problems in large-scale tests) - Implemented the `DelayedSampleSetCached`; I need this behavior to speed up the score computation.
- !52 [CheckpointWrapper] Allow custom save and load functions through estimator tags
- !54 Fixed multiqueue: Hi @amohammadi @ydayer, I'm fixing here the issue raised with the multiqueue. I was wrongly setting all tasks to run under a particular resource restriction. Now the problem is fixed. To get it running, you have to wrap your pipeline in the same way as before and fetch the resources like this:

```python
pipeline = bob.pipelines.wrap(
    ["sample", "checkpoint", "dask"],
    pipeline,
    model_path="./",
    transform_extra_arguments=(("metadata", "metadata"),),
    fit_tag="q_short_gpu",
)

from bob.pipelines.distributed.sge import get_resource_requirements

resources = get_resource_requirements(pipeline)
pipeline.fit_transform(X_as_sample).compute(
    scheduler=client, resources=resources
)
```
- !56 Two new features: - Moved `dask_get_partition_size` from bob.bio.base to bob.pipelines - Updated the target duration of a task to 10s, which is very aggressive for scale-up
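  A hedged sketch of how the relocated `dask_get_partition_size` might be used, assuming it is importable from `bob.pipelines.distributed` and takes the cluster plus the number of objects, returning a partition size for `dask.bag` (signature and import path are assumptions):

```python
import dask.bag
from bob.pipelines.distributed import dask_get_partition_size

# Assumption: the helper picks a partition size that keeps each dask task
# close to the ~10s target duration mentioned above.
partition_size = dask_get_partition_size(cluster, len(samples))
bag = dask.bag.from_sequence(samples, partition_size=partition_size)
```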
- !58 Moved the CSVBaseSampleLoader from bob.bio.base to bob.pipelines. This is a general function
- !55 Moved VALID_DASK_CLIENT_STRINGS to bob.pipelines
- !59 Dask client names
- !60 CSVSampleLoaders as transformers: Made CSVSampleLoaders scikit-learn transformers. This is a good idea indeed. I made two classes: `CSVToSampleLoader`, which converts one line to one sample, and `AnnotationsLoader`, which aggregates from `CSVToSampleLoader` to read annotations using `bob.db.base.read_anno...`. This is delayed. I'm already porting this stuff to `bob.bio.base`. Code is way cleaner. ping @amohammadi @ydayer Closes #30
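  To illustrate the idea behind !60 (one CSV line becomes one sample, exposed through scikit-learn's fit/transform protocol), a generic sketch; the class and the "path" column below are illustrative, not the actual `CSVToSampleLoader` API:

```python
import csv
import bob.pipelines as mario

class RowToSample:
    """Illustrative transformer: one CSV row -> one Sample."""

    def fit(self, X, y=None):
        # Stateless loader: nothing to fit.
        return self

    def transform(self, csv_path):
        with open(csv_path) as f:
            # The hypothetical "path" column becomes the sample data reference;
            # every other column is kept as metadata on the sample.
            return [mario.Sample(row.pop("path"), **row) for row in csv.DictReader(f)]
```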
- !61 Fixed modules: config files from here are not available once `conda install bob.pipelines` is run
- !62 Implement a new simple generic csv-based database interface: Depends on bob.extension!126
v0.0.1b0: First beta [skip-ci]