Commit 8733ffcb authored by Matthew Carrigan

Removing a couple of other old unnecessary comments

parent 8a861048
# Step 1: Slurp the dataset up, tokenize each sentence, and store as docs -> sentences -> tokens
# Step 2: Walk over the dataset, using the Google BERT logic to concatenate sentences into training examples
# Step 3: Write out the examples, possibly as Torch tensors?
from argparse import ArgumentParser
from pathlib import Path
from tqdm import tqdm, trange
......
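The three step comments above outline the preprocessing pipeline this script implements. Below is a minimal, self-contained sketch of how those steps might look, assuming a corpus with one sentence per line and blank lines between documents, and using plain whitespace splitting as a stand-in for a real BERT tokenizer. The function names, the simplified chunking rule, and the plain-text output format in Step 3 are illustrative assumptions, not this script's actual implementation.

```python
from argparse import ArgumentParser
from pathlib import Path

from tqdm import tqdm


def load_documents(corpus_path, tokenizer):
    """Step 1: read the corpus (assumed one sentence per line, blank line
    between documents) and tokenize into docs -> sentences -> tokens."""
    docs, doc = [], []
    with open(corpus_path, encoding="utf-8") as f:
        for line in tqdm(f, desc="Loading dataset"):
            line = line.strip()
            if not line:  # a blank line ends the current document
                if doc:
                    docs.append(doc)
                    doc = []
            else:
                doc.append(tokenizer(line))
    if doc:  # don't drop a trailing document with no blank line after it
        docs.append(doc)
    return docs


def build_examples(docs, max_seq_len=128):
    """Step 2: concatenate consecutive sentences from each document into
    examples no longer than max_seq_len, in the spirit of Google BERT's
    create_pretraining_data logic (greatly simplified here)."""
    examples = []
    for doc in docs:
        chunk, chunk_len = [], 0
        for sent in doc:
            # Flush the current chunk if adding this sentence would overflow.
            if chunk and chunk_len + len(sent) > max_seq_len:
                examples.append([tok for s in chunk for tok in s])
                chunk, chunk_len = [], 0
            chunk.append(sent)
            chunk_len += len(sent)
        if chunk:
            examples.append([tok for s in chunk for tok in s])
    return examples


if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("--corpus", type=Path, required=True)
    parser.add_argument("--output", type=Path, required=True)
    parser.add_argument("--max_seq_len", type=int, default=128)
    args = parser.parse_args()

    # Whitespace tokenizer used as a placeholder for a real WordPiece tokenizer.
    docs = load_documents(args.corpus, tokenizer=str.split)
    examples = build_examples(docs, args.max_seq_len)

    # Step 3: write the examples out; here as one whitespace-joined example
    # per line, though serializing to Torch tensors is another option.
    with args.output.open("w", encoding="utf-8") as out:
        for ex in examples:
            out.write(" ".join(ex) + "\n")
```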