weka

Topics related to weka:

Getting started with weka

Loading Instances

Text Classification

How to use R in Weka

Why use R in Weka?

  1. R is a powerful tool for preprocessing data
  2. R has a huge and still-growing number of libraries
  3. R in Weka can easily get data from Weka, process it, and pass it back to Weka seamlessly

How to set up R in Weka

For Mac users

  1. replace the old Info.plist with the new one given by Mark Hall

  2. download and install R

  3. install rJava inside R with

    install.packages('rJava')

  4. install Rplugin with Weka Package Manager

  5. go to the weka 3-8-0 folder (if that is the version you are using), open a terminal there, and

  6. run the following 2 lines of code (thanks to Michael Hall)

    export R_HOME=/Library/Frameworks/R.framework/Resources
    java -Xss10M -Xmx4096M -cp .:weka.jar weka.gui.GUIChooser

  7. to make life easier, save the two lines above into a file named weka_r.sh inside the directory where you want to work with weka

  8. make it executable: in a terminal opened in this directory, run the code below:

    chmod a+x weka_r.sh

  9. copy weka.jar from the weka 3-8-0 folder into the directory and run the code below:

    ./weka_r.sh

Now you are ready to go. Next time, just open a terminal in that directory and run ./weka_r.sh to start Weka with R.
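
Steps 6 to 8 can also be done in one go. The sketch below writes the two launch lines into weka_r.sh and marks it executable. The `#!/bin/sh` line and the here-document are additions for illustration and are not part of the original instructions; the R_HOME path is the one given in step 6 and may differ on your machine.

```shell
#!/bin/sh
# Sketch: create weka_r.sh in the current directory (step 7)
# and make it executable (step 8).
cat > weka_r.sh <<'EOF'
#!/bin/sh
# Assumption: R is installed at the framework location from step 6.
export R_HOME=/Library/Frameworks/R.framework/Resources
# weka.jar must be in the same directory as this script (step 9).
java -Xss10M -Xmx4096M -cp .:weka.jar weka.gui.GUIChooser
EOF
chmod a+x weka_r.sh
```

Run it once in your working directory; afterwards ./weka_r.sh starts Weka as described in step 9.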


How to receive data from Weka?

Open Weka from a terminal:
go to the Weka 3-8-0 directory, open a terminal there, and run the following code:

    java -jar weka.jar

Data through Weka Explorer:

  1. in the Preprocess panel, click Open file and choose a data file from the weka data folder;
  2. go to the R console panel and type R scripts inside the R console box.

Data through Weka KnowledgeFlow:

  1. in the Data mining processes panel, open DataSources, choose a loader (ArffLoader, for example), and click to place it on the canvas;
  2. double-click ArffLoader to load a data file;
  3. in the Scripting panel, choose RScriptExecutor and click to place it on the canvas;
  4. option + click ArffLoader, select dataset, then click RScriptExecutor to link them;
  5. double-click RScriptExecutor to type an R script, or
  6. click Settings and select R Scripting to use the R console with weka's data.

Playing with R code

  1. load iris.arff with either Explorer or KnowledgeFlow;
  2. try the "Plotting inside R Console" example above

How to use CPython Scripting in Weka?

How to install CPython in Weka?

Install wekaPython

  1. go to Tools and open the Package manager
  2. search for wekaPython, select it, and click Install

Install Python libraries

  1. install Anaconda or conda
  2. install four packages: numpy, pandas, matplotlib, scikit-learn
  3. for full installation instructions, see the conda documentation
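
After installing, it can help to confirm that the four libraries are importable. The sketch below is one way to check; the function name check_python_libs is hypothetical, and it assumes the interpreter is reachable as python3 (substitute python if that is the name on your PATH). Note that scikit-learn is imported under the module name sklearn.

```shell
#!/bin/sh
# Sketch: report whether each required library can be imported
# by the python3 found on PATH.
check_python_libs() {
    # scikit-learn's import name is "sklearn".
    for module in numpy pandas matplotlib sklearn; do
        if python3 -c "import $module" >/dev/null 2>&1; then
            echo "$module: ok"
        else
            echo "$module: missing"
        fi
    done
}

check_python_libs
```

Any line that reports "missing" means that package still needs to be installed in the environment Weka will use.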

Simple Comparison of Weka Interfaces

Explorer

pro:

  1. do all things quickly
  2. give a quick and comprehensive view of data structure

cons: can't save the process;

Experimenter

pro:

  1. compare several models at once, e.g., run 3 different classifiers against 5 datasets all together, and see the compared result at one place;
  2. experiment can be saved

KnowledgeFlow

pro:

  1. do almost all things that Explorer can do
  2. can save the process

cons:

  1. KF can't do Experimenter's job, as it doesn't support loops, but ADAMS can help;
  2. KF can't access low-level functionalities inside Weka API;

SimpleCLI

pro: runs tasks similar to what Explorer does, from the command line

cons: it can't access all functionalities of the Weka API; Jython or Groovy scripting is recommended for that.

Workbench

pro: it gathers all other interfaces together into one place

Getting Started With Jython in Weka

How to set up Jython in Weka

  1. install the Jython and JFreeChart libraries from the Weka Package manager;

  2. open a terminal in your home directory and enter nano .bash_profile

  3. inside .bash_profile, add the line below

    export Weka_Data=User/Documents/Directory/Of/Your/Data

  4. save and exit

  5. in the terminal, run source .bash_profile

Then restart Weka, go to Tools, and click Jython console; you can then try the examples above.
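
Steps 2 to 4 can also be done without opening an editor. The sketch below appends the export line to a profile file only if it is not already there, so running it twice does no harm. The function name add_weka_data and the example path in the comment are hypothetical; Weka_Data is the variable name used in step 3.

```shell
#!/bin/sh
# Sketch: add "export Weka_Data=<dir>" to a profile file exactly once.
# $1 = profile file, $2 = directory that holds your data.
add_weka_data() {
    profile="$1"
    data_dir="$2"
    line="export Weka_Data=$data_dir"
    # Create the file if it does not exist yet.
    touch "$profile"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Example call (hypothetical path, adjust to your own data directory):
# add_weka_data "$HOME/.bash_profile" "$HOME/Documents/weka-data"
```

After calling it, run source .bash_profile as in step 5 so the current terminal picks up the variable.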

Mistakes easily made when using KnowledgeFlow

TrainingSetMaker and TestSetMaker

  1. A ClassAssigner must be linked between ArffLoader and TrainingSetMaker or TestSetMaker.

ArffSaver

  1. To save a dataset to an arff file successfully, it is safer to set relationNameForFilename to False in the ArffSaver configuration.

How to use TimeSeriesForecasting in KnowledgeFlow?

  1. open KnowledgeFlow and load a dataset with ArffLoader
  2. go to Settings, check the time series forecasting perspective, then right-click ArffLoader and send the data to all perspectives
  3. go to the time series forecasting perspective to set up a model
  4. run the model and copy the model to the clipboard
  5. press ctrl + v, then click to paste the model onto the Data mining process canvas
  6. save the predictions along with the original data with ArffSaver