java.lang.Object
  weka.datagenerators.DataGenerator
    weka.datagenerators.ClusterGenerator
      weka.datagenerators.clusterers.SubspaceCluster
public class SubspaceCluster
A data generator that produces data points in hyperrectangular subspace clusters.
Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1).
-a <num> The number of attributes (default 1).
-c Class flag; if set, the cluster is listed in an extra attribute.
-b <range> The indices for boolean attributes.
-m <range> The indices for nominal attributes.
-P <num> The noise rate in percent (default 0.0). Can be between 0% and 30%. (Remark: The original algorithm only allows noise up to 10%.)
-C <cluster-definition> A cluster definition of class 'SubspaceClusterDefinition' (definition needs to be quoted to be recognized as a single argument).
Options specific to weka.datagenerators.clusterers.SubspaceClusterDefinition:
-A <range> Generates randomly distributed instances in the cluster.
-U <range> Generates uniformly distributed instances in the cluster.
-G <range> Generates gaussian distributed instances in the cluster.
-D <num>,<num> The attribute min/max (-A and -U) or mean/stddev (-G) for the cluster.
-N <num>..<num> The range of number of instances per cluster (default 1..50).
-I Uses integer instead of continuous values (default continuous).
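The option list above can be turned into an argument array programmatically. The following is a minimal sketch, assuming a hypothetical helper class `SubspaceClusterOptions` that is not part of Weka; it only assembles strings from the documented flags (`-a`, `-P`, `-c`, `-C`, and the definition flags `-A`/`-U`/`-G`, `-D`, `-N`) and does not call the generator itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Weka: assembles an option array
// from the flags documented for SubspaceCluster.
class SubspaceClusterOptions {

    // Builds one cluster definition string to pass after -C.
    // type is 'A', 'U' or 'G'; bounds is "min,max" (-A, -U) or "mean,stddev" (-G).
    static String clusterDefinition(char type, String attrRange, String bounds,
                                    int minInstances, int maxInstances) {
        if (type != 'A' && type != 'U' && type != 'G') {
            throw new IllegalArgumentException("type must be A, U or G");
        }
        return "-" + type + " " + attrRange
                + " -D " + bounds
                + " -N " + minInstances + ".." + maxInstances;
    }

    // Builds the generator options. Each cluster definition stays a single
    // array element, which plays the role of quoting on a command line.
    static String[] build(int numAttributes, double noisePercent, boolean classFlag,
                          List<String> clusterDefinitions) {
        List<String> opts = new ArrayList<>();
        opts.add("-a");
        opts.add(Integer.toString(numAttributes));
        opts.add("-P");
        opts.add(Double.toString(noisePercent));
        if (classFlag) {
            opts.add("-c");
        }
        for (String def : clusterDefinitions) {
            opts.add("-C");
            opts.add(def);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String def = clusterDefinition('U', "1", "0,10", 1, 50);
        String[] opts = build(2, 5.0, true, Collections.singletonList(def));
        System.out.println(String.join(" | ", opts));
    }
}
```

An array built this way could presumably be handed to `setOptions(java.lang.String[])`; on a shell command line the `-C` value would need quotes instead, as the option description notes.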
Field Summary | |
---|---|
static int | CONTINUOUS - cluster subtype: continuous |
static int | GAUSSIAN - cluster type: gaussian |
static int | INTEGER - cluster subtype: integer |
static Tag[] | TAGS_CLUSTERSUBTYPE - the tags for the cluster subtypes |
static Tag[] | TAGS_CLUSTERTYPE - the tags for the cluster types |
static int | TOTAL_UNIFORM - cluster type: total uniform |
static int | UNIFORM_RANDOM - cluster type: uniform/random |
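To make the cluster types and subtypes more concrete, here is a minimal sketch of drawing a single attribute value. It is an illustration under stated assumptions, not Weka's implementation: the class `ClusterSamplingSketch` and its enum are hypothetical, the uniform/random case is assumed to draw uniformly between min and max, the gaussian case is assumed to use mean and standard deviation, and the integer subtype is assumed to round to the nearest whole number. The total uniform type is not covered.

```java
import java.util.Random;

// Hypothetical illustration, not Weka code.
class ClusterSamplingSketch {

    // Stand-ins for the UNIFORM_RANDOM and GAUSSIAN cluster types.
    enum Type { UNIFORM_RANDOM, GAUSSIAN }

    // For UNIFORM_RANDOM, a and b are min and max.
    // For GAUSSIAN, a and b are mean and standard deviation.
    // integerSubtype mimics the INTEGER subtype by rounding the result.
    static double sample(Type type, double a, double b, boolean integerSubtype, Random rnd) {
        double v;
        if (type == Type.GAUSSIAN) {
            v = a + b * rnd.nextGaussian();
        } else {
            v = a + (b - a) * rnd.nextDouble();
        }
        return integerSubtype ? Math.rint(v) : v;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1); // fixed seed so runs are repeatable
        System.out.println(sample(Type.UNIFORM_RANDOM, 0.0, 10.0, false, rnd));
        System.out.println(sample(Type.GAUSSIAN, 5.0, 1.0, true, rnd));
    }
}
```

Passing the `Random` instance in, rather than creating it inside the method, keeps the draw reproducible for a given seed.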
Constructor Summary | |
---|---|
SubspaceCluster() - Initializes the generator and sets the number of clusters to 0, since the user has to specify them explicitly. |
Method Summary | |
---|---|
java.lang.String | clusterDefinitionsTipText() - Returns the tip text for this property. |
Instances | defineDataFormat() - Initializes the format for the dataset produced. |
Instance | generateExample() - Generates an example of the dataset. |
Instances | generateExamples() - Generates all examples of the dataset. |
java.lang.String | generateFinished() - Compiles documentation about the data generation after the generation process. |
java.lang.String | generateStart() - Compiles documentation about the data generation before the generation process. |
ClusterDefinition[] | getClusterDefinitions() - Returns the currently set clusters. |
double | getNoiseRate() - Gets the percentage of noise set. |
int[] | getNumValues() - Returns the array that stores the number of values for a nominal attribute. |
java.lang.String[] | getOptions() - Gets the current settings of the data generator. |
boolean | getSingleModeFlag() - Gets the single mode flag. |
java.lang.String | globalInfo() - Returns a string describing this data generator. |
boolean | isBoolean(int index) - Returns true if the attribute is boolean. |
boolean | isNominal(int index) - Returns true if the attribute is nominal. |
java.util.Enumeration | listOptions() - Returns an enumeration describing the available options. |
static void | main(java.lang.String[] args) - Main method for testing this class. |
java.lang.String | noiseRateTipText() - Returns the tip text for this property. |
java.lang.String | numAttributesTipText() - Returns the tip text for this property. |
void | setClusterDefinitions(ClusterDefinition[] value) - Sets the clusters to use. |
void | setNoiseRate(double newNoiseRate) - Sets the percentage of noise. |
void | setNumAttributes(int numAttributes) - Sets the number of attributes the dataset should have. |
void | setOptions(java.lang.String[] options) - Parses a list of options for this object. |
Methods inherited from class weka.datagenerators.ClusterGenerator |
---|
booleanColsTipText, classFlagTipText, getBooleanCols, getClassFlag, getNominalCols, getNumAttributes, nominalColsTipText, setBooleanCols, setBooleanIndices, setClassFlag, setNominalCols, setNominalIndices |
Methods inherited from class weka.datagenerators.DataGenerator |
---|
debugTipText, defaultOutput, formatTipText, getDatasetFormat, getDebug, getOutput, getRandom, getRelationName, getSeed, makeData, outputTipText, randomTipText, relationNameTipText, seedTipText, setDatasetFormat, setDebug, setOutput, setRandom, setRelationName, setSeed |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int UNIFORM_RANDOM
public static final int TOTAL_UNIFORM
public static final int GAUSSIAN
public static final Tag[] TAGS_CLUSTERTYPE
public static final int CONTINUOUS
public static final int INTEGER
public static final Tag[] TAGS_CLUSTERSUBTYPE
Constructor Detail |
---|
public SubspaceCluster()
Method Detail |
---|
public java.lang.String globalInfo()
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class ClusterGenerator
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a list of options for this object. Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1).
-a <num> The number of attributes (default 1).
-c Class flag; if set, the cluster is listed in an extra attribute.
-b <range> The indices for boolean attributes.
-m <range> The indices for nominal attributes.
-P <num> The noise rate in percent (default 0.0). Can be between 0% and 30%. (Remark: The original algorithm only allows noise up to 10%.)
-C <cluster-definition> A cluster definition of class 'SubspaceClusterDefinition' (definition needs to be quoted to be recognized as a single argument).
Options specific to weka.datagenerators.clusterers.SubspaceClusterDefinition:
-A <range> Generates randomly distributed instances in the cluster.
-U <range> Generates uniformly distributed instances in the cluster.
-G <range> Generates gaussian distributed instances in the cluster.
-D <num>,<num> The attribute min/max (-A and -U) or mean/stddev (-G) for the cluster.
-N <num>..<num> The range of number of instances per cluster (default 1..50).
-I Uses integer instead of continuous values (default continuous).
setOptions in interface OptionHandler
setOptions in class ClusterGenerator
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class ClusterGenerator
See Also: DataGenerator.removeBlacklist(String[])
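The cluster-definition options use two small value formats: `<num>..<num>` for `-N` and `<num>,<num>` for `-D`. The sketch below shows one way such values could be parsed and validated. The class `DefinitionValueParser` and its methods are hypothetical and are not Weka's parsing code; they only illustrate the documented formats.

```java
// Hypothetical illustration, not Weka code.
class DefinitionValueParser {

    // Parses a -N value such as "1..50" into {min, max}.
    static int[] parseInstanceRange(String value) {
        String[] parts = value.split("\\.\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected <num>..<num>: " + value);
        }
        int min = Integer.parseInt(parts[0].trim());
        int max = Integer.parseInt(parts[1].trim());
        if (min > max) {
            throw new IllegalArgumentException("min exceeds max: " + value);
        }
        return new int[] {min, max};
    }

    // Parses a -D value such as "0,10" into {first, second}:
    // min/max for -A and -U, mean/stddev for -G.
    static double[] parsePair(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected <num>,<num>: " + value);
        }
        return new double[] {
            Double.parseDouble(parts[0].trim()),
            Double.parseDouble(parts[1].trim())
        };
    }

    public static void main(String[] args) {
        int[] n = parseInstanceRange("1..50");
        double[] d = parsePair("0,10");
        System.out.println(n[0] + " to " + n[1] + " instances, bounds " + d[0] + " and " + d[1]);
    }
}
```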
public void setNumAttributes(int numAttributes)
setNumAttributes in class ClusterGenerator
Parameters: numAttributes - the new number of attributes
public java.lang.String numAttributesTipText()
numAttributesTipText in class ClusterGenerator
public double getNoiseRate()
public void setNoiseRate(double newNoiseRate)
Parameters: newNoiseRate - new percentage of noise
public java.lang.String noiseRateTipText()
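The `-P` option documents the noise rate as a percentage between 0% and 30%. The following sketch shows how such a bound could be checked and how a percentage might translate into a number of noise instances. Both the class `NoiseRateSketch` and the rounding rule are assumptions for illustration; the source does not state how the generator derives the noise count.

```java
// Hypothetical illustration, not Weka code.
class NoiseRateSketch {

    static final double MIN_PERCENT = 0.0;
    static final double MAX_PERCENT = 30.0; // upper bound documented for -P

    // Rejects values outside the documented 0..30 percent range.
    static double validate(double percent) {
        if (Double.isNaN(percent) || percent < MIN_PERCENT || percent > MAX_PERCENT) {
            throw new IllegalArgumentException("noise rate must be between 0 and 30 percent");
        }
        return percent;
    }

    // Assumed rule: noise instances = percentage of the cluster instances, rounded.
    static long noiseInstances(int clusterInstances, double percent) {
        return Math.round(clusterInstances * validate(percent) / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(noiseInstances(200, 10.0));
    }
}
```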
public ClusterDefinition[] getClusterDefinitions()
public void setClusterDefinitions(ClusterDefinition[] value) throws java.lang.Exception
Parameters: value - the clusters to use
Throws: java.lang.Exception - if clusters are not the correct class
public java.lang.String clusterDefinitionsTipText()
public boolean getSingleModeFlag()
getSingleModeFlag in class DataGenerator
public Instances defineDataFormat() throws java.lang.Exception
defineDataFormat in class DataGenerator
Throws: java.lang.Exception - if the data format could not be defined
See Also: DataGenerator.defaultRelationName()
public boolean isBoolean(int index)
Parameters: index - index of the attribute
public boolean isNominal(int index)
Parameters: index - index of the attribute
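The `-b` and `-m` options take index ranges, and `isBoolean` and `isNominal` answer whether a given attribute falls inside them. As a rough sketch of that kind of lookup, the hypothetical class below parses a simple range string such as `1,3-5` into a bit set and tests membership. It assumes one-based indices in the range string and a zero-based index for the lookup, and it handles only plain numbers and `a-b` spans; none of this is taken from Weka's own range handling.

```java
import java.util.BitSet;

// Hypothetical illustration, not Weka code.
class AttributeRangeSketch {

    // Parses a one-based range string such as "1,3-5" into a zero-based BitSet.
    static BitSet parse(String range, int numAttributes) {
        BitSet set = new BitSet(numAttributes);
        for (String part : range.split(",")) {
            String p = part.trim();
            if (p.isEmpty()) {
                continue;
            }
            int dash = p.indexOf('-');
            int from = Integer.parseInt(dash < 0 ? p : p.substring(0, dash).trim());
            int to = dash < 0 ? from : Integer.parseInt(p.substring(dash + 1).trim());
            if (from < 1 || to > numAttributes || from > to) {
                throw new IllegalArgumentException("bad range: " + p);
            }
            // BitSet.set(fromInclusive, toExclusive) with zero-based positions
            set.set(from - 1, to);
        }
        return set;
    }

    // Membership test for a zero-based attribute index.
    static boolean contains(BitSet set, int zeroBasedIndex) {
        return set.get(zeroBasedIndex);
    }

    public static void main(String[] args) {
        BitSet booleanCols = parse("1,3-5", 6);
        for (int i = 0; i < 6; i++) {
            System.out.println("attribute " + (i + 1) + " boolean: " + contains(booleanCols, i));
        }
    }
}
```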
public int[] getNumValues()
public Instance generateExample() throws java.lang.Exception
generateExample in class DataGenerator
Throws: java.lang.Exception - if format not defined or generating
public Instances generateExamples() throws java.lang.Exception
generateExamples in class DataGenerator
Throws: java.lang.Exception - if format not defined
public java.lang.String generateFinished() throws java.lang.Exception
generateFinished in class DataGenerator
Throws: java.lang.Exception - no input structure has been defined
public java.lang.String generateStart()
generateStart in class DataGenerator
public static void main(java.lang.String[] args)
Parameters: args - should contain arguments for the data producer