##Catalog

* [Data Structure API](Python-API.md#basic-data-structure-api)
    - [Dataset](Python-API.md#dataset)
    - [Booster](Python-API.md#booster)

* [Training API](Python-API.md#training-api)
    - [train](Python-API.md#trainparams-train_set-num_boost_round100-valid_setsnone-valid_namesnone-fobjnone-fevalnone-init_modelnone-feature_namenone-categorical_featurenone-early_stopping_roundsnone-evals_resultnone-verbose_evaltrue-learning_ratesnone-callbacksnone)
    - [cv](Python-API.md#cvparams-train_set-num_boost_round10-nfold5-stratifiedfalse-metricsnone-fobjnone-fevalnone-init_modelnone-feature_namenone-categorical_featurenone-early_stopping_roundsnone-fpreprocnone-verbose_evalnone-show_stdvtrue-seed0-callbacksnone)

* [Scikit-learn API](Python-API.md#scikit-learn-api)
    - [Common Methods](Python-API.md#common-methods)
    - [LGBMClassifier](Python-API.md#lgbmclassifier)
    - [LGBMRegressor](Python-API.md#lgbmregressor)
    - [LGBMRanker](Python-API.md#lgbmranker)
    
The methods of each class are listed in alphabetical order.

----

##Basic Data Structure API

###Dataset

####__init__(data, label=None, max_bin=255, reference=None, weight=None, group=None, silent=False, feature_name=None, categorical_feature=None, params=None, free_raw_data=True)

    Parameters
    ----------
    data : str/numpy array/scipy.sparse
        Data source of Dataset.
        When data type is string, it represents the path of a txt file.
    label : list or numpy 1-D array, optional
        Label of the data.
    max_bin : int, optional
        Max number of discrete bins for features.
    reference : Dataset, optional
        If this is a validation dataset, the training data should be used as reference.
    weight : list or numpy 1-D array, optional
        Weight for each instance.
    group : list or numpy 1-D array, optional
        Group/query size for dataset.
    silent : boolean, optional
        Whether to print messages during construction.
    feature_name : list of str
        Feature names.
    categorical_feature : list of str or list of int
        Categorical features;
        type int represents index,
        type str represents feature name (feature_name must be specified as well).
    params : dict, optional
        Other parameters.
    free_raw_data : bool
        True to free raw data after constructing the inner dataset.
    

####construct()

    Lazy init
    

####create_valid(data, label=None, weight=None, group=None, silent=False, params=None)

    Create validation data aligned with the current dataset.

    Parameters
    ----------
    data : str/numpy array/scipy.sparse
        Data source of _InnerDataset.
        When data type is string, it represents the path of a txt file.
    label : list or numpy 1-D array, optional
        Label of the training data.
    weight : list or numpy 1-D array, optional
        Weight for each instance.
    group : list or numpy 1-D array, optional
        Group/query size for dataset.
    silent : boolean, optional
        Whether to print messages during construction.
    params : dict, optional
        Other parameters.
    

####get_group()

    Get the group of the Dataset.

    Returns
    -------
    group : array
    

####get_init_score()

    Get the initial score of the Dataset.

    Returns
    -------
    init_score : array
    

####get_label()

    Get the label of the Dataset.

    Returns
    -------
    label : array
    

####get_weight()

    Get the weight of the Dataset.

    Returns
    -------
    weight : array
    

####num_data()

    Get the number of rows in the Dataset.

    Returns
    -------
    number of rows : int
    

####num_feature()

    Get the number of columns (features) in the Dataset.

    Returns
    -------
    number of columns : int
    

####save_binary(filename)

    Save Dataset to binary file.

    Parameters
    ----------
    filename : str
        Name of the output file.
    

####set_categorical_feature(categorical_feature)

    Set categorical features.

    Parameters
    ----------
    categorical_feature : list of str or list of int
        Name (str) or index (int) of categorical features

    

####set_feature_name(feature_name)

    Set feature name.

    Parameters
    ----------
    feature_name : list of str
        Feature names
    

####set_group(group)

    Set group size of Dataset (used for ranking).

    Parameters
    ----------
    group : numpy array or list or None
        Group size of each group
    

####set_init_score(init_score)

    Set init score of booster to start from.

    Parameters
    ----------
    init_score : numpy array or list or None
        Init score for booster
    

####set_label(label)

    Set label of Dataset.

    Parameters
    ----------
    label : numpy array or list or None
        The label information to be set into Dataset
    

####set_reference(reference)

    Set reference dataset.

    Parameters
    ----------
    reference : Dataset
        The reference Dataset is used as a template to construct the current dataset.
    

####set_weight(weight)

    Set weight of each instance.

    Parameters
    ----------
    weight : numpy array or list or None
        Weight for each data point
    

####subset(used_indices, params=None)

    Get subset of current dataset.

    Parameters
    ----------
    used_indices : list of int
        Used indices of this subset
    params : dict
        Other parameters
    

###Booster

####__init__(params=None, train_set=None, model_file=None, silent=False)

    Initialize the Booster.

    Parameters
    ----------
    params : dict
        Parameters for boosters.
    train_set : Dataset
        Training dataset
    model_file : str
        Path to the model file.
    silent : boolean, optional
        Whether to print messages during construction.
    

####add_valid(data, name)

    Add validation data.

    Parameters
    ----------
    data : Dataset
        Validation data
    name : str
        Name of validation data
    

####attr(key)

    Get attribute string from the Booster.

    Parameters
    ----------
    key : str
        The key to get attribute from.

    Returns
    -------
    value : str
        The attribute value of the key; returns None if the attribute does not exist.
    

####current_iteration()

    Get current number of iterations.

    Returns
    -------
    result : int
        Current number of iterations

####dump_model()

    Dump the model to JSON format.

    Returns
    -------
    result : dict or list
        JSON representation of the model
    

####eval(data, name, feval=None)

    Evaluate for data.

    Parameters
    ----------
    data : _InnerDataset object
        Data to evaluate.
    name : str
        Name of the data.
    feval : function
        Custom evaluation function.

    Returns
    -------
    result : list
        Evaluation result list.
    

####eval_train(feval=None)

    Evaluate for training data.

    Parameters
    ----------
    feval : function
        Custom evaluation function.

    Returns
    -------
    result : list
        Evaluation result list.
    

####eval_valid(feval=None)

    Evaluate for validation data.

    Parameters
    ----------
    feval : function
        Custom evaluation function.

    Returns
    -------
    result : list
        Evaluation result list.
    

####feature_importance(importance_type="split")

    Get feature importances.

    Parameters
    ----------
    importance_type : str
        How the importance is calculated: "split" means the number of times
        the feature is used in a model, "gain" means the total gain of splits
        which use the feature.

    Returns
    -------
    result : array
        Array of feature importances
    

####predict(data, num_iteration=-1, raw_score=False, pred_leaf=False, data_has_header=False, is_reshape=True)

    Make a prediction.

    Parameters
    ----------
    data : str/numpy array/scipy.sparse
        Data source for prediction.
        When data type is string, it represents the path of a txt file.
    num_iteration : int
        Number of iterations used for prediction; -1 means use all.
    raw_score : bool
        True to predict raw scores.
    pred_leaf : bool
        True to predict leaf indices.
    data_has_header : bool
        Whether the txt data file has a header.
    is_reshape : bool
        Reshape the prediction to (nrow, ncol) if True.

    Returns
    -------
    Prediction result
    

####reset_parameter(params)

    Reset parameters for booster.

    Parameters
    ----------
    params : dict
        New parameters for boosters.
    

####rollback_one_iter()

    Rollback one iteration.
    

####save_model(filename, num_iteration=-1)

    Save model of booster to file.

    Parameters
    ----------
    filename : str
        Filename to save to.
    num_iteration : int
        Number of iterations to save; < 0 means save all.
    

####set_attr(**kwargs)

    Set the attribute of the Booster.

    Parameters
    ----------
    **kwargs
        The attributes to set. Setting a value to None deletes an attribute.
    

####set_train_data_name(name)

    Set training data name.

    Parameters
    ----------
    name : str
        Name of training data.

####update(train_set=None, fobj=None)

    Update for one iteration.
    Note: for multi-class task, the score is grouped by class_id first, then by row_id.
          If you want to get the i-th row score in the j-th class, access score[j*num_data+i],
          and you should group grad and hess in this way as well.

    Parameters
    ----------
    train_set : Dataset
        Training data; None means use the last training data.
    fobj : function
        Customized objective function.

    Returns
    -------
    is_finished : bool
    
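The flat score layout described in the note can be checked with plain NumPy: element `[j*num_data + i]` of the flat array equals element `[j, i]` of a `(num_class, num_data)` reshape.

```python
import numpy as np

num_data, num_class = 4, 3
# Flat layout: all rows of class 0 first, then class 1, then class 2.
score = np.arange(num_data * num_class, dtype=float)

i, j = 2, 1  # row 2, class 1
assert score[j * num_data + i] == score.reshape(num_class, num_data)[j, i]
```

A custom `fobj` must return `grad` and `hess` in this same flat layout.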

##Training API

####train(params, train_set, num_boost_round=100, valid_sets=None, valid_names=None, fobj=None, feval=None, init_model=None, feature_name=None, categorical_feature=None, early_stopping_rounds=None, evals_result=None, verbose_eval=True, learning_rates=None, callbacks=None)

    Train with given parameters.

    Parameters
    ----------
    params : dict
        Parameters for training.
    train_set : Dataset
        Data to be trained.
    num_boost_round: int
        Number of boosting iterations.
    valid_sets: list of Datasets
        List of data to be evaluated during training
    valid_names: list of str
        Names of valid_sets.
    fobj : function
        Customized objective function.
    feval : function
        Customized evaluation function.
        Note: should return (eval_name, eval_result, is_higher_better) or a list of these.
    init_model : file name of lightgbm model or 'Booster' instance
        Model used for continued training.
    feature_name : list of str
        Feature names.
    categorical_feature : list of str or list of int
        Categorical features;
        type int represents index,
        type str represents feature name (feature_name must be specified as well).
    early_stopping_rounds : int
        Activates early stopping.
        Requires at least one validation set and one metric.
        If there is more than one, all of them will be checked.
        Returns the model at (best_iter + early_stopping_rounds).
        If early stopping occurs, the model will add a 'best_iteration' field.
    evals_result : dict or None
        This dictionary is used to store all evaluation results of the items in valid_sets.
        Example: with valid_sets containing [valid_set, train_set],
                 valid_names containing ['eval', 'train'],
                 and a parameter containing 'metric': 'logloss',
        it returns: {'train': {'logloss': ['0.48253', '0.35953', ...]},
                     'eval': {'logloss': ['0.480385', '0.357756', ...]}}
        Pass None to skip collecting evaluation results.
    verbose_eval : bool or int
        Requires at least one item in evals.
        If `verbose_eval` is True,
            the eval metric on the valid set is printed at each boosting stage.
        If `verbose_eval` is int,
            the eval metric on the valid set is printed at every `verbose_eval` boosting stage.
        The last boosting stage
            or the boosting stage found by using `early_stopping_rounds` is also printed.
        Example: with verbose_eval=4 and at least one item in evals,
            an evaluation metric is printed every 4 (instead of 1) boosting stages.
    learning_rates: list or function
        List of learning rate for each boosting round
        or a customized function that calculates learning_rate in terms of
        current number of round (and the total number of boosting round)
        (e.g. yields learning rate decay)
        - list l: learning_rate = l[current_round]
        - function f: learning_rate = f(current_round, total_boost_round)
                   or learning_rate = f(current_round)
    callbacks : list of callback functions
        List of callback functions that are applied at end of each iteration.

    Returns
    -------
    booster : a trained booster model
    

####cv(params, train_set, num_boost_round=10, nfold=5, stratified=False, metrics=None, fobj=None, feval=None, init_model=None, feature_name=None, categorical_feature=None, early_stopping_rounds=None, fpreproc=None, verbose_eval=None, show_stdv=True, seed=0, callbacks=None)

    Cross-validation with given parameters.

    Parameters
    ----------
    params : dict
        Booster params.
    train_set : Dataset
        Data to be trained.
    num_boost_round : int
        Number of boosting iterations.
    nfold : int
        Number of folds in CV.
    stratified : bool
        Perform stratified sampling.
    folds : a KFold or StratifiedKFold instance
        Sklearn KFolds or StratifiedKFolds.
    metrics : str or list of str
        Evaluation metrics to be watched in CV.
    fobj : function
        Custom objective function.
    feval : function
        Custom evaluation function.
    init_model : file name of lightgbm model or 'Booster' instance
        Model used for continued training.
    feature_name : list of str
        Feature names
    categorical_feature : list of str or int
        Categorical features, type int represents index,
        type str represents feature names (need to specify feature_name as well)
    early_stopping_rounds: int
        Activates early stopping. CV error needs to decrease at least
        every <early_stopping_rounds> round(s) to continue.
        Last entry in evaluation history is the one from best iteration.
    fpreproc : function
        Preprocessing function that takes (dtrain, dtest, param)
        and returns transformed versions of those.
    verbose_eval : bool, int, or None, default None
        Whether to display the progress.
        If None, progress will be displayed when np.ndarray is returned.
        If True, progress will be displayed at boosting stage.
        If an integer is given,
            progress will be displayed at every given `verbose_eval` boosting stage.
    show_stdv : bool, default True
        Whether to display the standard deviation in progress.
        Results are not affected; the returned history always contains std.
    seed : int
        Seed used to generate the folds (passed to numpy.random.seed).
    callbacks : list of callback functions
        List of callback functions that are applied at end of each iteration.

    Returns
    -------
    evaluation history : list of str
    

##Scikit-learn API

###Common Methods

####__init__(boosting_type="gbdt", num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=10, max_bin=255, silent=True, objective="regression", nthread=-1, min_split_gain=0, min_child_weight=5, min_child_samples=10, subsample=1, subsample_freq=1, colsample_bytree=1, reg_alpha=0, reg_lambda=0, scale_pos_weight=1, is_unbalance=False, seed=0)

    Implementation of the Scikit-Learn API for LightGBM.

    Parameters
    ----------
    boosting_type : str
        gbdt, traditional Gradient Boosting Decision Tree
        dart, Dropouts meet Multiple Additive Regression Trees
    num_leaves : int
        Maximum tree leaves for base learners.
    max_depth : int
        Maximum tree depth for base learners, -1 means no limit.
    learning_rate : float
        Boosting learning rate
    n_estimators : int
        Number of boosted trees to fit.
    silent : boolean
        Whether to print messages while running boosting.
    objective : str or callable
        Specify the learning task and the corresponding learning objective or
        a custom objective function to be used (see note below).
        default: binary for LGBMClassifier, regression for LGBMRegressor, lambdarank for LGBMRanker
    nthread : int
        Number of parallel threads.
    min_split_gain : float
        Minimum loss reduction required to make a further partition on a leaf node of the tree.
    min_child_weight : int
        Minimum sum of instance weight (hessian) needed in a child (leaf).
    min_child_samples : int
        Minimum number of data needed in a child (leaf).
    subsample : float
        Subsample ratio of the training instances.
    subsample_freq : int
        Frequency of subsampling; <= 0 means disabled.
    colsample_bytree : float
        Subsample ratio of columns when constructing each tree.
    reg_alpha : float
        L1 regularization term on weights.
    reg_lambda : float
        L2 regularization term on weights.
    scale_pos_weight : float
        Balancing of positive and negative weights.
    is_unbalance : bool
        Whether the training data is unbalanced, for binary classification.
    seed : int
        Random number seed.

    Note
    ----
    A custom objective function can be provided for the ``objective``
    parameter. In this case, it should have the signature
    ``objective(y_true, y_pred) -> grad, hess``
        or ``objective(y_true, y_pred, group) -> grad, hess``:

        y_true: array_like of shape [n_samples]
            The target values
        y_pred: array_like of shape [n_samples] or shape [n_samples * n_class]
            The predicted values
        group: array_like
            group/query data, used for ranking task
        grad: array_like of shape [n_samples] or shape [n_samples * n_class]
            The value of the gradient for each sample point.
        hess: array_like of shape [n_samples] or shape [n_samples * n_class]
            The value of the second derivative for each sample point

    For multi-class task, y_pred is grouped by class_id first, then by row_id.
        If you want to get the i-th row y_pred in the j-th class, access y_pred[j*num_data+i],
        and you should group grad and hess in this way as well.
    

####apply(X, num_iteration=0)

    Return the predicted leaf of every tree for each sample.

    Parameters
    ----------
    X : array_like, shape=[n_samples, n_features]
        Input features matrix.

    num_iteration : int
        Limit number of iterations in the prediction; defaults to 0 (use all trees).

    Returns
    -------
    X_leaves : array_like, shape=[n_samples, n_trees]
    

####booster()

    Get the underlying lightgbm Booster of this model.
    This will raise an exception when it's called before fit().

    Returns
    -------
    booster : a lightgbm booster of underlying model
    

####evals_result()

    Return the evaluation results.

    Returns
    -------
    evals_result : dictionary
    

####feature_importance()

    Return the feature importances of each feature.

    Returns
    -------
    result : array
        Array of normalized feature importances
    

####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, other_params=None)

    Fit the gradient boosting model.

    Parameters
    ----------
    X : array_like
        Feature matrix
    y : array_like
        Labels
    sample_weight : array_like
        weight of training data
    init_score : array_like
        init score of training data
    group : array_like
        group data of training data
    eval_set : list, optional
        A list of (X, y) tuple pairs to use as a validation set for early-stopping
    eval_sample_weight : List of array
        weight of eval data
    eval_init_score : List of array
        init score of eval data
    eval_group : List of array
        group data of eval data
    eval_metric : str, list of str, callable, optional
        If a str, should be a built-in evaluation metric to use.
        If callable, a custom evaluation metric, see note for more details.
    early_stopping_rounds : int
        Activates early stopping.
    verbose : bool
        If `verbose` and an evaluation set is used, the evaluation metric
        measured on the validation set is printed during training.
    feature_name : list of str
        Feature names
    categorical_feature : list of str or int
        Categorical features,
        type int represents index,
        type str represents feature names (need to specify feature_name as well)
    other_params: dict
        Other parameters

    Note
    ----
    Custom eval function expects a callable with following functions:
        ``func(y_true, y_pred)``, ``func(y_true, y_pred, weight)``
            or ``func(y_true, y_pred, weight, group)``.
        return (eval_name, eval_result, is_bigger_better)
            or list of (eval_name, eval_result, is_bigger_better)

        y_true: array_like of shape [n_samples]
            The target values
        y_pred: array_like of shape [n_samples] or shape[n_samples * n_class] (for multi-class)
            The predicted values
        weight: array_like of shape [n_samples]
            The weight of samples
        group: array_like
            group/query data, used for ranking task
        eval_name: str
            name of evaluation
        eval_result: float
            eval result
        is_bigger_better: bool
            is eval result bigger better, e.g. AUC is bigger_better.
    For multi-class task, y_pred is grouped by class_id first, then by row_id.
      If you want to get the i-th row y_pred in the j-th class, access y_pred[j*num_data+i].
    

####get_params(deep=False)

    Get parameters.
    

####predict(data, raw_score=False, num_iteration=0)

    Return the predicted value for each sample.

    Parameters
    ----------
    data : array_like, shape=[n_samples, n_features]
        Input features matrix.

    num_iteration : int
        Limit number of iterations in the prediction; defaults to 0 (use all trees).

    Returns
    -------
    predicted_result : array_like, shape=[n_samples] or [n_samples, n_classes]
    

###LGBMClassifier

####predict_proba(data, raw_score=False, num_iteration=0)

    Return the predicted probability for each class for each sample.

    Parameters
    ----------
    data : array_like, shape=[n_samples, n_features]
        Input features matrix.

    num_iteration : int
        Limit number of iterations in the prediction; defaults to 0 (use all trees).

    Returns
    -------
    predicted_probability : array_like, shape=[n_samples, n_classes]
    

###LGBMRegressor

###LGBMRanker

####fit(X, y, sample_weight=None, init_score=None, group=None, eval_set=None, eval_sample_weight=None, eval_init_score=None, eval_group=None, eval_metric=None, eval_at=None, early_stopping_rounds=None, verbose=True, feature_name=None, categorical_feature=None, other_params=None)

    Most arguments are the same as in Common Methods, except:

    eval_at : list of int
        The evaluation positions of NDCG