
Training Step (ModelGenerator)

A Training Step in lensless (referred to as ModelGenerator in flow steps) fine-tunes a base model on your custom dataset. It’s the backbone for use cases like personalized avatars, advanced product photography, or domain-specific text-to-image tasks.

If your flow involves training a model, this is most likely the step you'll need to tinker with to get the best results. You can also train models directly through the dashboard, as well as upload datasets and generate images using your trained models.

  • Models generated through the dashboard expire after a month unless specified otherwise.
  • Models generated through flows expire after a week.
  • Currently it’s not possible to download the generated model. It should only be used to generate images through lensless. If your use case requires downloading it, contact us.

How to Prepare a Dataset

To train a model on lensless, you need a dataset of images. Having a high quality dataset is the single most important factor when training a model. For example, if training a model to learn a person's appearance, all images should have good lighting and resolution, include different angles and poses, and contain only one person.

Since you don't have control over the uploaded datasets for public flows, you should instruct your users on how to best select their images. There are also hyperparameters you can configure to try to minimize the impact of a suboptimal dataset, but you'll need to experiment with them.

  • Upload via Dashboard:
    • Go to the Subjects section.
    • Create or select a subject.
    • Upload your images (e.g. .jpg or .png).
  • Upload via Flow Input:
    • Declare an input property with "dataset": true (e.g. "userDatasetId").
    • When a user runs the Flow, they’ll be prompted to upload files.
    • These files are automatically assembled into a dataset under your organization.

Each dataset has a unique id you can reference in the Flow’s training parameters.
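
For example, a Flow input block that prompts users for a dataset might look like this (a minimal sketch based on the full example in the next section; the min/max bounds are assumed to limit the number of uploaded images):

  input: {
    type: 'object',
    required: ['userDatasetId'],
    properties: {
      userDatasetId: {
        type: 'string',
        dataset: true, // marks this input as a dataset upload
        min: 4,        // assumed: minimum number of images
        max: 20,       // assumed: maximum number of images
        title: 'Your photos',
        description: 'Upload clear, well-lit photos of yourself from different angles.',
      },
    },
  }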

Defining a Training flow step

In your Flow’s steps array, add a ModelGenerator object. For instance:

{
  title: 'You with different recipes!',
  description: 'Train a model using your photos and generate images and recipes with you as the chef!',
  public: true,
  input: {
    type: 'object',
    required: ['userDatasetId'],
    properties: {
      userDatasetId: {
        max: 20,
        min: 4,
        type: 'string',
        title: 'Your photos',
        dataset: true,
        description: 'All photos should be of high quality. You must be alone and with different poses.',
      },
    },
  },
  id: 'recipeModelGenerator',
  type: 'ModelGenerator',
  parameters: {
    datasetId: '$.input.userDatasetId',
    settings: {
      baseModel: 'stable-diffusion-v1-5/stable-diffusion-v1-5',
      epochs: 50,
      // ... your other hyperparameters (see below)
    },
  },
}
  • id: a unique identifier for the step.
  • type: Must be “ModelGenerator”.
  • parameters.datasetId: Points to the dataset you want to use. It can be a static value (like a dataset uploaded through the dashboard) or a JSON Path reference (like $.input.userDatasetId).
  • parameters.settings: A block of hyperparameters to control how the training is performed (detailed below).

Hyperparameters

The settings object has a variety of fields to customize your training. Some settings have a subtle effect; others can drastically impact the results.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| baseModel | string | null | A valid Hugging Face model ID to train on top of, e.g. "stable-diffusion-v1-5/stable-diffusion-v1-5". |
| learningRate | number | 1.0 | Main learning rate. |
| epochs | number (1–10k) | dynamic | If not provided, we consider factors such as the base model and the size of the dataset to try to guess a good amount. We recommend experimenting and setting this parameter manually. |
| repeats | number (1–10k) | 1 | How many times each sample is repeated per epoch. |
| resolution | number (32–2048) | 1024 | Image resolution used for training. |
| normalizeDataset | boolean | false | If true, attempts to normalize images (cropping, resizing, etc.) before training. |
| maskedLoss | boolean | false | Enables training with body/head/foreground masks for person images. |
| headMaskShade | number (0–255) | 255 if maskedLoss=true, else undefined | Shade intensity for the face region. |
| bodyMaskShade | number (0–255) | 200 if maskedLoss=true, else null | Shade intensity for the main body region. |
| boundingMaskShade | number (0–255) | 40 if maskedLoss=true, else undefined | Shade for the bounding box. |
| optimizer | enum: Optimizer | Prodigy | Prodigy, AdaFactor, AdamW, AdamW8bit, DAdaptation, DAdaptAdam. |
| precision | enum: Precision | fp16 | Mixed-precision mode (fp16 or bf16). |
| scheduler | enum: Scheduler | Cosine | Cosine, Constant, CosineAnnealing. |
| lossType | enum: LossType | L2 | L2, SmoothL1, Huber. |
| networkDim | number (0–256) | 64 | LoRA or additional network dimension. |
| alphaDim | number | = networkDim | Matches networkDim if undefined. Typically the same as or smaller than networkDim. |
| batchSize | number (1–6) | 1 | Batch size per training step. |
| captionPrefix | string | null | Prefix to add to generated or existing captions (e.g. a unique token like "lvff"). |
| trainUnetOnly | boolean | false | If true, only train the U-Net portion, ignoring the text encoder. |
| biasCorrection | boolean | false | Recommended for certain optimizers such as Prodigy. |
| weightDecay | number | null | Recommended for certain optimizers such as Prodigy. |
| d0 | number | null | Recommended for certain optimizers such as Prodigy. |
| decouple | boolean | false | Recommended for certain optimizers such as Prodigy. |
| dCoef | number | null | Recommended for certain optimizers such as Prodigy. |
| huberC | number | 1.0 or null | Required when using the Huber loss type. |
| seed | number | null | Recommended for more deterministic results. Useful when testing different hyperparameters. |
| flipAug | boolean | false | Useful when dealing with very limited datasets. Should only be used for symmetrical subjects. |
| noiseOffset | number | null | If greater than 0, additional noise is added. Values range from 0 to 1, where 0 adds no noise at all and 1 adds strong noise. |
| multiresNoiseIterations | number | null | Multires creates noise at various resolutions and adds them together to create the final additive noise. Specifies how many resolutions to create. |
| multiresNoiseDiscount | number | null | Weakens the noise amount at each resolution. A value between 0 and 1; the lower the number, the weaker the noise. |
| timestepSampling | enum: TimestepSampling | null | How to sample timesteps. sigma: sigma-based; uniform: uniform random; sigmoid: sigmoid of a random normal; shift: shifts the sigmoid of a random normal; flux_shift: shifts the sigmoid of a random normal, depending on the resolution. |
| guidanceScale | number | null | |

Additional fields (dCoef, biasCorrection, weightDecay, etc.) are used in advanced or specialized training scenarios. Their defaults work well for typical training runs.
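
Some of these fields only make sense together. As an illustrative sketch drawn from the table above (the values are starting points, not tuned recommendations):

  settings: {
    optimizer: 'Prodigy',
    learningRate: 1.0,    // Prodigy adapts step sizes, so 1.0 is a common starting point
    biasCorrection: true, // recommended for Prodigy
    decouple: true,       // recommended for Prodigy
    lossType: 'Huber',
    huberC: 1.0,          // required whenever lossType is 'Huber'
    seed: 42,             // fixed seed makes hyperparameter comparisons reproducible
  }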

For a real-world example, see:

{
  id: 'recipeModelTrainer',
  type: 'ModelGenerator',
  parameters: {
    datasetId: '$.input.userDatasetId',
    settings: {
      baseModel: 'Realistic_Vision_V6.0_B1',
      epochs: 80,
      repeats: 1,
      alphaDim: 64,
      decouple: true,
      batchSize: 1,
      optimizer: 'Prodigy',
      precision: 'fp16',
      scheduler: 'Cosine',
      maskedLoss: true,
      networkDim: 64,
      resolution: 1024,
      learningRate: 1,
      bodyMaskShade: 200,
      headMaskShade: 255,
      biasCorrection: true,
      normalizeDataset: true,
      boundingMaskShade: 40,
      backgroundMaskShade: 40,
    },
  },
}

Training Templates

Repeatedly specifying the same training parameters can be tedious. Training Templates let you define a configuration once and reuse it.

  • Create Template in Dashboard:
    • Go to Trainings > New training template.
    • Fill out fields like base model, epochs, etc.
    • Save it as a template.
    • Select it on the dashboard when configuring a training.
  • Reference Template in a Flow: provide trainingSettingsId instead of a full settings object:

{
  id: 'trainer',
  type: 'ModelGenerator',
  parameters: {
    trainingSettingsId: 'your-template-uuid',
    datasetId: '$.input.datasetId',
  },
}

  • Override: You can still override or supplement the template in your Flow's settings block, as sketched below.
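
For instance, a step that uses a template but adjusts one field might look like this (a sketch; how an inline settings block merges with a template is an assumption worth verifying):

{
  id: 'trainer',
  type: 'ModelGenerator',
  parameters: {
    trainingSettingsId: 'your-template-uuid',
    datasetId: '$.input.datasetId',
    settings: {
      epochs: 100, // assumed to override the template's epochs while keeping its other values
    },
  },
}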

This feature helps ensure consistency across multiple projects—no need to retype baseModel, epochs, or specialized optimizer parameters over and over.

Key Points & Best Practices

  • Dataset Quality: The training results heavily depend on data quality. Ensure images are clear, properly represent your subject, have good lighting, are diverse, etc.
  • Base Model: Must be a valid Hugging Face repository name (e.g. “stable-diffusion-v1-5/stable-diffusion-v1-5”). If you see errors, verify your spelling and that the model is publicly accessible.
  • Cost: Training is billed at $0.06 per minute. Watch your logs and balance to avoid unexpected costs.
  • Overriding Defaults: Many fields (like betas, decouple) are specialized for advanced training. If unsure, start with the defaults—they work well for typical use cases.
  • You will most likely need to tinker with the hyperparameters to get the best results. When testing, setting a fixed seed is a good idea.
  • Some hyperparameters only work well in combination with others.
  • When testing, start by defining only a small subset of hyperparameters and leave the rest at their defaults. As the results improve, you can start experimenting with more, as in the sketch below.
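
Putting these tips together, a first run might define only the essentials plus a seed, leaving everything else at its defaults (illustrative values):

  settings: {
    baseModel: 'stable-diffusion-v1-5/stable-diffusion-v1-5',
    epochs: 50,   // set manually rather than relying on the dynamic default
    seed: 1234,   // fixed seed so successive runs are comparable
    // all other hyperparameters left at their defaults
  }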