ezpz.examples.deepspeed.tp.train_batch_length
DeepSpeed TP training script that pads sequences to max_length for benchmarking.
Launch with:
Argparse help is available once optional dependencies (transformers/deepspeed) are installed:
DataArguments
dataclass
Data path configuration for supervised training.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
DataCollatorForSupervisedDataset
dataclass
Bases: object
Collate examples for supervised fine-tuning.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
__call__(instances)
Pad and batch token/label tensors.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
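The padding step can be sketched in plain Python, with lists standing in for tensors. This is an illustrative sketch, not the script's implementation: the helper name `pad_and_batch`, the `max_length` argument, and the use of -100 for label padding (the default `ignore_index` of PyTorch's `CrossEntropyLoss`) are assumptions.

```python
IGNORE_INDEX = -100  # assumed label-padding value; matches CrossEntropyLoss's default ignore_index


def pad_and_batch(instances, pad_token_id, max_length=None):
    """Pad input_ids/labels of every example to a common length (hypothetical helper).

    With max_length=None the batch is padded to its longest example;
    otherwise every example is truncated or padded to exactly max_length.
    """
    if max_length is None:
        target_len = max(len(ex["input_ids"]) for ex in instances)
    else:
        target_len = max_length

    batch = {"input_ids": [], "labels": [], "attention_mask": []}
    for ex in instances:
        ids = ex["input_ids"][:target_len]
        labels = ex["labels"][:target_len]
        n_pad = target_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        # Padded label positions are excluded from the loss.
        batch["labels"].append(labels + [IGNORE_INDEX] * n_pad)
        # 1 marks real tokens, 0 marks padding.
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
    return batch
```

Padding every batch to a fixed `max_length` keeps tensor shapes constant across steps, which is what makes step times comparable when benchmarking.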
ModelArguments
dataclass
SupervisedDataset
Bases: Dataset
Dataset for supervised fine-tuning.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
__getitem__(i)
__init__(data_path, tokenizer)
Load, tokenize, and cache dataset for instruction tuning.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
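A minimal sketch of the load, tokenize, and cache pattern, assuming the data file is a JSON list of records. The class name `CachedSupervisedDataset`, the `prompt`/`response` field names, and the `tokenize` callable are hypothetical; the real class subclasses `Dataset` and takes a tokenizer object.

```python
import json


class CachedSupervisedDataset:
    """Hypothetical stand-in: tokenizes everything once in __init__, then serves by index."""

    def __init__(self, data_path, tokenize):
        # Load the whole file up front (assumed format: a JSON list of dicts).
        with open(data_path, encoding="utf-8") as f:
            records = json.load(f)
        # Tokenize each record exactly once and keep the results in memory,
        # so __getitem__ is a cheap list lookup during training.
        self.examples = [
            {"input_ids": tokenize(rec["prompt"] + " " + rec["response"])}
            for rec in records
        ]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        return self.examples[i]
```

Tokenizing in `__init__` trades startup time and memory for a constant per-item cost, so data loading does not perturb step timings.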
TrainingArguments
dataclass
Bases: TrainingArguments
Extended training arguments with padding/max length controls.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
make_supervised_data_module(tokenizer, data_args)
Make dataset and collator for supervised fine-tuning.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
preprocess(sources, targets, tokenizer)
Preprocess the data by tokenizing.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
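One common way to preprocess source/target pairs for supervised fine-tuning is to tokenize the concatenated text and mask the source tokens in the labels so that loss is computed only on the target. The sketch below illustrates that convention; whether this script masks the source is an assumption, and `toy_tokenize`, `preprocess_sketch`, and the whitespace vocabulary are hypothetical stand-ins for a real tokenizer.

```python
IGNORE_INDEX = -100  # assumed mask value; the default ignore_index of CrossEntropyLoss


def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer: each unseen word gets the next free id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def preprocess_sketch(sources, targets, vocab):
    """Build input_ids and labels, masking the source portion of each example."""
    input_ids, labels = [], []
    for source, target in zip(sources, targets):
        source_ids = toy_tokenize(source, vocab)
        full_ids = source_ids + toy_tokenize(target, vocab)
        # Copy the ids, then hide the source positions from the loss.
        example_labels = [IGNORE_INDEX] * len(source_ids) + full_ids[len(source_ids):]
        input_ids.append(full_ids)
        labels.append(example_labels)
    return {"input_ids": input_ids, "labels": labels}
```

For example, with an empty vocabulary, `preprocess_sketch(["translate hello"], ["bonjour"], {})` yields `input_ids` of `[[0, 1, 2]]` and `labels` of `[[-100, -100, 2]]`.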
smart_tokenizer_and_embedding_resize(special_tokens_dict, tokenizer, model)
Resize tokenizer and embedding.
Note: This is the unoptimized version; after resizing, the embedding size may no longer be divisible by 64.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py
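The divisibility note comes down to simple arithmetic: adding a handful of special tokens to a vocabulary whose size is a multiple of 64 leaves a size that is not. The sketch below shows the check and the rounded-up size an optimized variant could use instead; the helper name `round_up_to_multiple` and the vocabulary size of 32000 are illustrative assumptions, not values taken from the script.

```python
def round_up_to_multiple(n, multiple=64):
    """Return the smallest multiple of `multiple` that is >= n (hypothetical helper)."""
    return ((n + multiple - 1) // multiple) * multiple


old_vocab_size = 32000   # illustrative; divisible by 64
num_new_tokens = 1       # e.g. a single added pad token

new_vocab_size = old_vocab_size + num_new_tokens
# Unoptimized resize: the embedding matrix gets exactly new_vocab_size rows.
is_aligned = new_vocab_size % 64 == 0
# Optimized alternative: allocate extra unused rows to restore alignment.
padded_vocab_size = round_up_to_multiple(new_vocab_size, 64)
```

Here `new_vocab_size` is 32001, which is not divisible by 64, while the padded size is 32064.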
train()
Run supervised causal LM fine-tuning with DeepSpeed/transformers.
Source code in src/ezpz/examples/deepspeed/tp/train_batch_length.py