support more flexible setting of conv head; slice inputs when batch size is too large in PFNLayer to avoid bugs (#124)

* support more flexible setting
* slice inputs of nn.Linear when batch size is too large
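
A minimal sketch of the slicing idea, not the actual PFNLayer code from this change: when the first dimension of the input is larger than a threshold, the linear layer is applied to consecutive slices and the results are concatenated. The helper name `sliced_linear`, the `part` threshold, and the tensor shapes are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn


def sliced_linear(linear: nn.Module, inputs: torch.Tensor, part: int) -> torch.Tensor:
    """Apply `linear` to `inputs` in slices of at most `part` rows along dim 0.

    `part` is a hypothetical threshold; the value used by the real layer
    is not stated in this message.
    """
    if inputs.shape[0] <= part:
        # Small enough: one ordinary call.
        return linear(inputs)
    # torch.split yields consecutive chunks of `part` rows (the last may be smaller).
    outputs = [linear(chunk) for chunk in torch.split(inputs, part, dim=0)]
    return torch.cat(outputs, dim=0)


torch.manual_seed(0)
linear = nn.Linear(10, 64, bias=False)
inputs = torch.randn(7, 32, 10)  # assumed layout: (pillars, points per pillar, features)

with torch.no_grad():
    full = linear(inputs)                           # single call on the whole batch
    sliced = sliced_linear(linear, inputs, part=3)  # slices of 3, 3 and 1 rows
```

Because nn.Linear only acts on the last dimension, slicing along the first dimension should give the same values as a single call, so the two paths can be compared directly.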