Commit 8733ffcb authored by Matthew Carrigan's avatar Matthew Carrigan

Removing a couple of other old unnecessary comments

parent 8a861048
# Step 1: Slurp the dataset up, tokenize each sentence, and store as docs -> sentences -> tokens
# Step 2: Walk over the dataset, using the Google BERT logic to concatenate sentences into training examples
# Step 3: Write out the examples, possibly as Torch tensors?
from argparse import ArgumentParser
from pathlib import Path
from tqdm import tqdm, trange
......
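The three steps in the comments above can be sketched roughly as follows. Both helpers here are hypothetical placeholders: a plain whitespace split stands in for the real BERT tokenizer, and the greedy packing loop is a simplification of the Google BERT sentence-concatenation logic, not the actual implementation.

```python
def load_corpus(text):
    """Step 1 sketch: raw text -> docs -> sentences -> tokens.

    Assumes documents are separated by blank lines and each non-blank
    line is one sentence; whitespace tokenization is a placeholder.
    """
    docs = []
    current_doc = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank line ends the current document
            if current_doc:
                docs.append(current_doc)
                current_doc = []
        else:
            current_doc.append(line.split())  # placeholder tokenizer
    if current_doc:
        docs.append(current_doc)
    return docs


def make_examples(docs, max_tokens=128):
    """Step 2 sketch: greedily concatenate sentences into examples.

    Sentences from one document are packed into token chunks of at
    most max_tokens; chunks never cross document boundaries.
    """
    examples = []
    for doc in docs:
        chunk = []
        for sentence in doc:
            if chunk and len(chunk) + len(sentence) > max_tokens:
                examples.append(chunk)
                chunk = []
            chunk.extend(sentence)
        if chunk:
            examples.append(chunk)
    return examples
```

Step 3 (serializing the examples, e.g. as Torch tensors) would then consume the output of `make_examples`.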