RESOURCES
A. SOFTWARE
General Purpose Data Mining
WEKA
(Source: Java)
MLC++
(Source: C++)
SIPINA
List from KDnuggets
(Various)
List from Data Management Center
(Various)
Classification
C4.5
(Decision tree)
OC1
(Oblique decision tree)
Ripper
(Rule-based)
CBA
(Association-rule based)
bayes
(Naive Bayes)
Evidential distance-based
(Nearest-neighbor)
PEBLS
(Nearest-neighbor)
mlp
(Neural Network)
tiberius
(Neural Network)
svmlight
(Support Vector Machine)
Association Analysis
FIMI Repository of Algorithms
Apriori, Eclat, and FP-growth
ARTool
ARMADA
(Association rule mining in Matlab)
Tree Mining, Closed Itemsets, Sequential Pattern Mining
PAFI
Cluster Analysis
CLUTO
Open Source Clustering Software
Model-based Clustering
Online software for Clustering
Anomaly Detection
ORCA
(Distance-based)
Regression
Regression routines
Data Preprocessing
Feature Selection
Isomap
(Dimensionality reduction, in Matlab)
B. DATA SETS
IDS data sets
Data Sets for Data Mining
Competition Data Set
UCI Machine Learning Repository
Quest data repository
KDnuggets