src.corpora.auto.auto_detokenize
auto_detokenize(dataset_id: str, dataset: DatasetDict, preprocess_path: Path, preprocessing_num_proc: int = 4) → DatasetDict
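The signature gives the shape of the call but does not describe the behavior. The following is a minimal, self-contained sketch of a function with the same parameters. It assumes the function selects a dataset-specific detokenizer by dataset_id and applies it to every split. The registry DETOKENIZERS, the helper wikitext_detokenize, the name auto_detokenize_sketch, and the plain-dict stand-in for DatasetDict are all hypothetical and are not taken from the Mistral source.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical stand-in for datasets.DatasetDict: split name -> list of examples.
Example = Dict[str, str]
ToyDatasetDict = Dict[str, List[Example]]


def wikitext_detokenize(example: Example) -> Example:
    """Undo a few WikiText-style tokenization artifacts (illustrative rules only)."""
    text = example["text"]
    text = text.replace(" @-@ ", "-").replace(" @,@ ", ",").replace(" @.@ ", ".")
    return {**example, "text": text}


# Hypothetical registry mapping dataset ids to detokenizer functions.
DETOKENIZERS: Dict[str, Callable[[Example], Example]] = {
    "wikitext": wikitext_detokenize,
}


def auto_detokenize_sketch(
    dataset_id: str,
    dataset: ToyDatasetDict,
    preprocess_path: Path,
    preprocessing_num_proc: int = 4,
) -> ToyDatasetDict:
    """Apply the detokenizer registered for dataset_id to every split.

    preprocess_path and preprocessing_num_proc are accepted only to mirror the
    documented signature; this sketch runs serially and writes no cache files.
    """
    detokenize = DETOKENIZERS.get(dataset_id)
    if detokenize is None:
        # Assumption: datasets without a registered detokenizer pass through unchanged.
        return dataset
    return {
        split: [detokenize(example) for example in examples]
        for split, examples in dataset.items()
    }


raw = {"train": [{"text": "a well @-@ known 1 @,@ 000 page book"}]}
clean = auto_detokenize_sketch("wikitext", raw, Path("cache"))
print(clean["train"][0]["text"])
```

In the real function, preprocess_path and preprocessing_num_proc presumably control where preprocessed data is cached and how many worker processes are used, but that is an inference from the parameter names, not something this page states.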