tianlh / LightGBM-DCU
Commit 68bc8a61, authored Jan 26, 2017 by Guolin Ke

not limit max_bin for categorical feature

parent 8980fc72

Showing 1 changed file with 9 additions and 10 deletions

src/io/bin.cpp (+9 -10)
...
...
@@ -171,18 +171,17 @@ void BinMapper::FindBin(const std::string& column_name, std::vector<double>* val
     // sort by counts
     Common::SortForPair<int, int>(counts_int, distinct_values_int, 0, true);
     // will ingore the categorical of small counts
-    num_bin_ = std::min(max_bin, static_cast<int>(counts_int.size()));
+    const int cut_cnt = static_cast<int>(sample_size * 0.95f);
     categorical_2_bin_.clear();
-    bin_2_categorical_ = std::vector<int>(num_bin_);
+    bin_2_categorical_.clear();
+    num_bin_ = 0;
     int used_cnt = 0;
-    for (int i = 0; i < num_bin_; ++i) {
-      bin_2_categorical_[i] = distinct_values_int[i];
-      categorical_2_bin_[distinct_values_int[i]] = static_cast<unsigned int>(i);
-      used_cnt += counts_int[i];
-    }
-    if (used_cnt / static_cast<double>(sample_size) < 0.95f) {
-      Log::Warning("Too many categoricals are ignored, \
-                   please use bigger max_bin or partition column \"%s\" ", column_name.c_str());
+    max_bin = std::min(static_cast<int>(distinct_values_int.size()), max_bin);
+    while (used_cnt < cut_cnt || num_bin_ < max_bin) {
+      bin_2_categorical_.push_back(distinct_values_int[num_bin_]);
+      categorical_2_bin_[distinct_values_int[num_bin_]] = static_cast<unsigned int>(num_bin_);
+      used_cnt += counts_int[num_bin_];
+      ++num_bin_;
     }
     cnt_in_bin = counts_int;
     cnt_in_bin[0] += static_cast<int>(sample_size) - used_cnt;
...
...
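To make the effect of the new loop easier to follow, here is a minimal standalone sketch of the same selection logic, assuming the category values and their counts are already sorted by count in descending order (as `Common::SortForPair` does above). The names `CategoricalBins` and `SelectCategoricalBins` are hypothetical and not part of LightGBM. The `num_bin < n` bounds guard is an addition of this sketch and is not present in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical result type for this sketch; not a LightGBM type.
struct CategoricalBins {
  std::vector<int> bin_to_category;                       // bin index -> category value
  std::unordered_map<int, unsigned int> category_to_bin;  // category value -> bin index
  int used_count = 0;                                     // samples covered by kept bins
};

// Hypothetical helper mirroring the while loop in the diff.
// `values` and `counts` are parallel arrays sorted by count, descending.
inline CategoricalBins SelectCategoricalBins(const std::vector<int>& values,
                                             const std::vector<int>& counts,
                                             int sample_size, int max_bin) {
  assert(values.size() == counts.size());
  CategoricalBins out;
  // Same 95% coverage threshold as cut_cnt in the diff.
  const int cut_cnt = static_cast<int>(sample_size * 0.95f);
  const int n = static_cast<int>(values.size());
  max_bin = std::min(n, max_bin);
  int num_bin = 0;
  // Keep adding categories while coverage is below the threshold OR fewer
  // than max_bin bins exist. The `num_bin < n` guard is this sketch's own
  // addition to avoid reading past the end of the arrays.
  while (num_bin < n && (out.used_count < cut_cnt || num_bin < max_bin)) {
    out.bin_to_category.push_back(values[num_bin]);
    out.category_to_bin[values[num_bin]] = static_cast<unsigned int>(num_bin);
    out.used_count += counts[num_bin];
    ++num_bin;
  }
  return out;
}
```

For example, with counts `{50, 30, 15, 5}`, a sample size of 100 and `max_bin = 2`, the first two categories cover only 80 samples, so the loop keeps a third category before stopping. This is the sense in which `max_bin` no longer limits a categorical feature.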