Running Weka’s algorithms from the command line requires only a very
simple Weka setup. All you need to do is download the latest release of
WEKA. One useful link that was working at the time of writing this post is:
Next, unzip the download, which gives you a directory named
“weka-3-6-9”. We will call it WEKA_HOME for reference in this blog post.
You might want to run Weka’s logistic regression (LR) algorithm on
two types of input data.
- One is the sample data files in ARFF format already available in “WEKA_HOME/data”.
- The other is data files you already have in CSV format. For example, the donut.csv file provided by Mahout for running its Logistic Regression.
Running LR over ARFF files
We will use the file “WEKA_HOME/data/weather.nominal.arff”
for running the algorithm. Cd to WEKA_HOME and run the following command:
java -cp ./weka.jar weka.classifiers.functions.Logistic -t ./data/weather.nominal.arff -T ./data/weather.nominal.arff -d /some_location_on_your_machine/weather.nominal.model.arff
which should generate the trained model at “/some_location_on_your_machine/weather.nominal.model.arff”
and the console output should look something like:
Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                                    Class
Variable                              yes
=========================================
outlook=sunny                    -45.2378
outlook=overcast                  57.5375
outlook=rainy                     -5.9067
temperature=hot                   -8.3327
temperature=mild                  44.8546
temperature=cool                 -45.4929
humidity                         118.1425
windy                             72.9648
Intercept                        -89.2032

Odds Ratios...
                                    Class
Variable                              yes
=========================================
outlook=sunny                           0
outlook=overcast      9.73275593611619E24
outlook=rainy                      0.0027
temperature=hot                    0.0002
temperature=mild     3.020787521374072E19
temperature=cool                        0
humidity            2.0353933107400553E51
windy                4.877521304260806E31

Time taken to build model: 0.12 seconds
Time taken to test model on training data: 0.01 seconds

=== Error on training data ===
Correctly Classified Instances          14              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0.0002 %
Root relative squared error              0.0008 %
Total Number of Instances               14

=== Confusion Matrix ===
 a b   <-- classified as
 9 0 | a = yes
 0 5 | b = no

=== Error on test data ===
Correctly Classified Instances          14              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0.0002 %
Root relative squared error              0.0008 %
Total Number of Instances               14

=== Confusion Matrix ===
 a b   <-- classified as
 9 0 | a = yes
 0 5 | b = no
Here the three arguments mean:
- -t <name of training file> : Sets training file.
- -T <name of test file> : Sets test file. If missing, a cross-validation will be performed on the training data.
- -d <name of output file> : Sets model output file. In case the filename ends with '.xml', only the options are saved to the XML file, not the model.
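As a rough sketch of how these three options fit together, the shell function below assembles the command line and prints it instead of running it. The name logistic_cmd and the example paths are made up for illustration, and it assumes paths without spaces. Leaving out the third argument drops -T, so Weka falls back to cross-validation on the training file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Weka): prints the Logistic command line.
# Usage: logistic_cmd TRAIN_FILE MODEL_FILE [TEST_FILE]
logistic_cmd() {
  train=$1
  model=$2
  test_file=${3:-}

  cmd="java -cp ./weka.jar weka.classifiers.functions.Logistic -t $train"

  # -T is optional; without it Weka cross-validates on the training data.
  if [ -n "$test_file" ]; then
    cmd="$cmd -T $test_file"
  fi

  # -d names the file the trained model is written to.
  cmd="$cmd -d $model"
  printf '%s\n' "$cmd"
}

# Example call (paths are placeholders):
#   logistic_cmd ./data/weather.nominal.arff /tmp/weather.nominal.model
```

Printing the command first makes it easy to check the paths before copying the line into the terminal from WEKA_HOME.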
For help on all available arguments, try running the following
command from WEKA_HOME:
java -cp ./weka.jar weka.classifiers.functions.Logistic -h
Running LR over CSV files
To run Weka’s LR over a CSV file, you’ll first need to convert it
into ARFF format using a converter provided by WEKA. Using the command line on
Linux, here are the steps:
Step-I: Convert the data from CSV to ARFF format by running the
following command from WEKA_HOME:
java -cp ./weka.jar weka.core.converters.CSVLoader someCSVFile.csv > outputARFFFile.arff
Step-II: Run the NumericToNominal filter over the ARFF file:
java -cp ./weka.jar weka.filters.unsupervised.attribute.NumericToNominal -i outputARFFFile.arff -o outputARFFFile.nominal.arff
Step-III: Run the classifier over outputARFFFile.nominal.arff:
java -cp ./weka.jar weka.classifiers.functions.Logistic -t outputARFFFile.nominal.arff -T outputARFFFile.nominal.arff -d outputARFFFile.nominal.model.arff
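The three steps can be chained in a small shell function, sketched here under the assumption that it is run from WEKA_HOME and that java is on the PATH. The function name csv_to_logistic and the derived file names are illustrative, not part of Weka.

```shell
#!/bin/sh
# Sketch: run Steps I-III for one CSV file from inside WEKA_HOME.
# csv_to_logistic is a hypothetical helper name, not a Weka tool.
csv_to_logistic() {
  csv=$1
  base=${csv%.csv}                 # someCSVFile.csv -> someCSVFile
  arff="$base.arff"
  nominal="$base.nominal.arff"
  model="$base.nominal.model"      # illustrative name for the saved model

  # Step I: CSV -> ARFF (CSVLoader writes the ARFF text to stdout).
  java -cp ./weka.jar weka.core.converters.CSVLoader "$csv" > "$arff" || return 1

  # Step II: convert numeric attributes to nominal ones.
  java -cp ./weka.jar weka.filters.unsupervised.attribute.NumericToNominal \
    -i "$arff" -o "$nominal" || return 1

  # Step III: train and test on the same file, saving the model with -d.
  java -cp ./weka.jar weka.classifiers.functions.Logistic \
    -t "$nominal" -T "$nominal" -d "$model"
}

# Usage (from WEKA_HOME):
#   csv_to_logistic someCSVFile.csv
```

The function stops at the first failing step, so a conversion error does not feed an empty ARFF file into the classifier.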
You might encounter an exception stating
"Cannot handle unary class!"
To resolve this, apply the Remove attribute filter
(weka.filters.unsupervised.attribute.Remove, named weka.filters.AttributeFilter
in older Weka releases) and eliminate the attribute that has the same value
for all the records in the file:
java -cp ./weka.jar weka.filters.unsupervised.attribute.Remove -i outputARFFFile.nominal.arff -o outputARFFFile.filtered.nominal.arff -R 8
where the value of “-R” will vary depending on your input file: it is
the 1-based index of the attribute to be eliminated in the input ARFF file.
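To find which index to use, one option is to scan the original CSV for columns whose value never changes. The sketch below assumes a simple CSV with a header row and no quoted commas, and that the column order matches the attribute order in the ARFF file; constant_columns is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the 1-based index of every CSV column
# that holds the same value in all data rows.
# Assumes a header row and no commas inside quoted fields.
constant_columns() {
  awk -F',' '
    NR == 1 { next }                                  # skip the header row
    NR == 2 {                                         # remember the first data row
      for (i = 1; i <= NF; i++) first[i] = $i
      n = NF
      next
    }
    {                                                 # mark columns that differ from it
      for (i = 1; i <= n; i++) if ($i != first[i]) varies[i] = 1
    }
    END {                                             # report columns never marked
      for (i = 1; i <= n; i++) if (!(i in varies)) print i
    }
  ' "$1"
}

# Usage:
#   constant_columns someCSVFile.csv
```

Each number printed is a candidate for the attribute index to remove.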
After this, try running the classifier on the resulting “outputARFFFile.filtered.nominal.arff”
file:
java -cp ./weka.jar weka.classifiers.functions.Logistic -t outputARFFFile.filtered.nominal.arff -T outputARFFFile.filtered.nominal.arff -d outputARFFFile.nominal.model.arff
The output should look similar to what we got when running the classifier over the provided sample data above.
With these steps, you are ready to play with WEKA. Go for it. Cheers!!!