I*****a posts: 5425 | 1
Hi guys, I have a question about cluster analysis with both numerical
variables and categorical (nominal) variables. I am not very familiar with
cluster analysis. Any feedback will be appreciated. I can only type Chinese
using my phone, which is too much of a pain... sorry.
1) What are the standard ways to deal with categorical variables? Do we
simply transform them into a lot of dummy variables? In my particular problem,
I have a pretty large dataset, where some variables may have hundreds of
thousands of categories. I don't know if that will be problematic.
2) Is there a rule of thumb for scaling the categorical variables relative
to the numerical ones?
3) What R packages do people usually use to deal with this kind of problem?
Thanks.

t*********e posts: 71 | 2
Group your numeric variables into categorical variables, then use a latent class analysis method.
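
A minimal sketch of the first step suggested above (turning a numeric variable into a categorical one), assuming equal-frequency (quantile) bins. The function name `quantile_bin`, the sample `ages` data, and the choice of 3 bins are illustrative assumptions, not something specified in the thread. The latent class step itself is not shown. Categories are coded as integers starting at 1, which is the coding that latent class tools such as the R package poLCA expect for their input variables.

```python
from bisect import bisect_right

def quantile_bin(values, n_bins=3):
    """Map each numeric value to an integer category 1..n_bins (roughly equal-frequency bins)."""
    ordered = sorted(values)
    n = len(ordered)
    # Cut points taken at the 1/n_bins, 2/n_bins, ... positions of the sorted data.
    cuts = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    # bisect_right counts how many cut points are <= v, i.e. a 0-based bin index.
    return [bisect_right(cuts, v) + 1 for v in values]

# Hypothetical numeric variable, for illustration only.
ages = [23, 35, 41, 52, 29, 64, 38, 47, 58]
age_cat = quantile_bin(ages, n_bins=3)
print(age_cat)
```

With heavy ties in the data the bins can end up unequal in size, so the number of bins is worth checking against the actual distribution of each variable.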