Package 'niarules' reference manual

Title:	Numerical Association Rule Mining using Population-Based Nature-Inspired Algorithms
Description:	A framework for mining numerical association rules using nature-inspired optimization algorithms. Inspired by the 'NiaARM' 'Python' and 'NiaARM' 'Julia' packages, it brings numerical association rule mining to the R programming language. See Fister Jr., Iglesias, Galvez, Del Ser, Osaba and Fister (2018) <doi:10.1007/978-3-030-03493-1_9>.
Authors:	Iztok Jr. Fister [aut, cre, cph]
Maintainer:	Iztok Jr. Fister <[email protected]>
License:	MIT + file LICENSE
Version:	0.2.0
Built:	2025-03-04 13:42:19 UTC
Source:	https://github.com/firefly-cpp/niarules

Add an attribute to the "rule" list.

Description

This function adds an attribute to the existing list.

Usage

add_attribute(rules, name, type, border1, border2, value)

Arguments

`rules`	The current rules list.
`name`	The name of the feature in the rule.
`type`	The type of the feature in the rule.
`border1`	The first border value in the rule.
`border2`	The second border value in the rule.
`value`	The value associated with the rule.

Value

The updated rules list.

Examples

rules <- list()
new_rules <- add_attribute(rules, "feature1", "numerical", 0.2, 0.8, "EMPTY")

Build rules based on a candidate solution.

Description

This function takes a candidate solution vector and a features list and builds a rule.

Usage

build_rule(solution, features)

Arguments

`solution`	The solution vector.
`features`	The features list.

Value

A rule.

Calculate the border value based on feature information and a given value.

Description

This function calculates the border value for a feature based on the feature information and a given value.

Usage

calculate_border(feature_info, value)

Arguments

`feature_info`	Information about the feature.
`value`	The value to calculate the border for.

Value

The calculated border value.

Examples

feature_info <- list(type = "numerical", lower_bound = 0, upper_bound = 1)
border_value <- calculate_border(feature_info, 0.5)

Calculate the fitness of an association rule.

Description

This function calculates the fitness of an association rule using support and confidence.

Usage

calculate_fitness(supp, conf)

Arguments

`supp`	The support of the association rule.
`conf`	The confidence of the association rule.

Value

The fitness of the association rule.
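
The manual does not state how support and confidence are combined. As a minimal sketch, assuming an unweighted mean of the two measures (the helper name `mean_fitness` is hypothetical, and the package's own formula may differ):

```r
# Hypothetical helper: combine support and confidence into a single score.
# Assumption: an unweighted mean; the package may use a different formula.
mean_fitness <- function(supp, conf) {
  stopifnot(supp >= 0, supp <= 1, conf >= 0, conf <= 1)
  (supp + conf) / 2
}

mean_fitness(0.4, 0.8)
```

Under this assumption both measures lie in [0, 1], so the score does too, and a rule must do well on both to score highly.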

Calculate the selected category based on a value and the number of categories.

Description

This function calculates the selected category based on a given value and the total number of categories.

Usage

calculate_selected_category(value, num_categories)

Arguments

`value`	The value to calculate the category for.
`num_categories`	The total number of categories.

Value

The calculated selected category.

Examples

selected_category <- calculate_selected_category(0.3, 5)
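
One plausible way to turn a value in [0, 1] into a category index is to split the interval into equally wide bins. The sketch below illustrates that idea only; `pick_category` is a hypothetical name and the package's actual rule may differ.

```r
# Hypothetical mapping from a value in [0, 1] to a 1-based category index.
# Assumption: equal-width bins; not necessarily the package's rule.
pick_category <- function(value, num_categories) {
  idx <- floor(value * num_categories) + 1
  min(idx, num_categories)  # value == 1 would otherwise give num_categories + 1
}

pick_category(0.3, 5)  # 0.3 falls in the second of five bins
```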

Check if the attribute conditions are satisfied for an instance.

Description

This function checks if the attribute conditions specified in the association rule are satisfied for a given instance row.

Usage

check_attribute(attribute, instance_row)

Arguments

`attribute`	An attribute with type and name information.
`instance_row`	A row representing an instance in the dataset.

Value

TRUE if conditions are satisfied, FALSE otherwise.

Calculate the cut point for an association rule.

Description

This function calculates the cut point, denoting which part of the vector belongs to the antecedent and which to the consequent of the mined association rule.

Usage

cut_point(sol, num_attr)

Arguments

`sol`	The cut value from the solution vector.
`num_attr`	The number of attributes in the association rule.

Value

The cut point value.

Implementation of Differential Evolution metaheuristic algorithm.

Description

This function uses Differential Evolution, a stochastic population-based optimization algorithm, to search for numerical association rules with high fitness.

Usage

differential_evolution(
  d = 10,
  np = 10,
  f = 0.5,
  cr = 0.9,
  nfes = 1000,
  features,
  data,
  is_time_series = FALSE
)

Arguments

`d`	Dimension of the problem (default: 10).
`np`	Population size (default: 10).
`f`	The differential weight, controlling the amplification of the difference vector (default: 0.5).
`cr`	The crossover probability, determining the probability of a component being replaced (default: 0.9).
`nfes`	The maximum number of function evaluations (default: 1000).
`features`	A list containing information about features, including type and bounds.
`data`	A data frame representing instances in the dataset.
`is_time_series`	A boolean indicating whether the dataset is a time series (default: FALSE).

Value

A list containing the best solution, its fitness value, the number of function evaluations, and the list of identified association rules.
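
To show how the parameters `np`, `f`, `cr`, and `nfes` interact, here is a minimal, generic DE/rand/1/bin sketch that maximizes a toy objective on [0, 1]^d. It is not the package's implementation: `de_sketch`, the toy objective, and the names of the returned list elements are all illustrative assumptions.

```r
# Generic DE/rand/1/bin sketch on the unit hypercube; maximizes `objective`.
# `de_sketch` and its return names are illustrative, not package functions.
de_sketch <- function(objective, d = 10, np = 10, f = 0.5, cr = 0.9, nfes = 1000) {
  stopifnot(np >= 4)                               # need 3 partners besides i
  pop <- matrix(runif(np * d), nrow = np)          # random initial population
  fit <- apply(pop, 1, objective)                  # costs np evaluations
  evals <- np
  while (evals < nfes) {
    for (i in seq_len(np)) {
      if (evals >= nfes) break
      r <- sample(setdiff(seq_len(np), i), 3)      # three distinct partners
      mutant <- pop[r[1], ] + f * (pop[r[2], ] - pop[r[3], ])
      mutant <- pmin(pmax(mutant, 0), 1)           # clamp to [0, 1]
      cross <- runif(d) < cr                       # binomial crossover mask
      cross[sample.int(d, 1)] <- TRUE              # keep at least one mutant gene
      trial <- ifelse(cross, mutant, pop[i, ])
      trial_fit <- objective(trial)
      evals <- evals + 1
      if (trial_fit >= fit[i]) {                   # greedy one-to-one selection
        pop[i, ] <- trial
        fit[i] <- trial_fit
      }
    }
  }
  best <- which.max(fit)
  list(best_solution = pop[best, ], best_fitness = fit[best],
       num_evaluations = evals)
}

set.seed(1)
# Toy objective with its maximum (0) at x = 0.5 in every dimension.
res <- de_sketch(function(x) -sum((x - 0.5)^2), d = 5, np = 10, nfes = 2000)
res$best_fitness
```

In rule mining the objective would instead score a candidate vector by the fitness of the rule it encodes; the toy function only stands in for that step.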

References

Storn, R., & Price, K. (1997). "Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces." Journal of Global Optimization, 11(4), 341–359. doi:10.1023/A:1008202821328

Evaluate a candidate solution, with optional time series filtering.

Description

This function evaluates the fitness of an association rule using support and confidence. If time series data is used, it restricts evaluation to the specified time range.

Usage

evaluate(solution, features, instances, is_time_series = FALSE)

Arguments

`solution`	A vector representing a candidate solution.
`features`	A list containing information about features.
`instances`	A data frame representing dataset instances.
`is_time_series`	A boolean flag indicating if time series filtering is required.

Value

A list containing fitness and identified rules.

References

Fister Jr., I., Iglesias, A., Galvez, A., Del Ser, J., Osaba, E., & Fister, I. (2018). "Differential evolution for association rule mining using categorical and numerical attributes." In Intelligent Data Engineering and Automated Learning–IDEAL 2018: 19th International Conference, Madrid, Spain, November 21–23, 2018, Proceedings, Part I (pp. 79-88). Springer International Publishing. doi:10.1007/978-3-030-03493-1_9

Fister Jr, I., Podgorelec, V., & Fister, I. (2021). "Improved nature-inspired algorithms for numeric association rule mining." In Intelligent Computing and Optimization: Proceedings of the 3rd International Conference on Intelligent Computing and Optimization 2020 (ICO 2020) (pp. 187-195). Springer International Publishing. doi:10.1007/978-3-030-68154-8_19

Extract feature information from a dataset, excluding timestamps.

Description

This function analyzes the given dataset and extracts information about each feature.

Usage

extract_feature_info(data, timestamp_col = "timestamp")

Arguments

`data`	The dataset to analyze.
`timestamp_col`	Optional. The name of the timestamp column to exclude from features.

Value

A list containing information about each feature, including type and bounds/categories.
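
The kind of structure described above can be sketched as follows. This is an illustration, not the package's code: `describe_features` is a hypothetical name, the fields `type`, `lower_bound`, and `upper_bound` follow the examples elsewhere in this manual, and the `categories` field is an assumption.

```r
# Hypothetical sketch of per-feature metadata extraction.
# Assumed fields: type, lower_bound, upper_bound (numerical), categories (categorical).
describe_features <- function(data, timestamp_col = "timestamp") {
  cols <- setdiff(names(data), timestamp_col)      # drop the timestamp column
  info <- lapply(cols, function(col) {
    x <- data[[col]]
    if (is.numeric(x)) {
      list(type = "numerical", lower_bound = min(x), upper_bound = max(x))
    } else {
      list(type = "categorical", categories = unique(as.character(x)))
    }
  })
  setNames(info, cols)
}

df <- data.frame(
  timestamp = c("01/01/2024 00:00:00", "01/01/2024 01:00:00"),
  temperature = c(20.5, 22.1),
  weather = c("sunny", "rainy")
)
str(describe_features(df))
```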

Get the position of a feature.

Description

This function returns the position of a feature in the vector, considering the type of the feature.

Usage

feature_position(features, feature)

Arguments

`features`	The features list.
`feature`	The name of the feature to find.

Value

The position of the feature.

Examples

features <- list(
  feature1 = list(type = "numerical"),
  feature2 = list(type = "categorical"),
  feature3 = list(type = "numerical")
)
position <- feature_position(features, "feature2")

Fix Borders of a Numeric Vector

Description

This function ensures that all values greater than 1.0 are set to 1.0, and all values less than 0.0 are set to 0.0.

Usage

fix_borders(vector)

Arguments

`vector`	A numeric vector to be processed.

Value

A numeric vector with borders fixed.
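
The clamping described above can be written in base R with `pmin()` and `pmax()`. The helper below is an illustrative equivalent with a hypothetical name (`clamp_unit`), not the package's source.

```r
# Illustrative equivalent of the described behavior (hypothetical name).
clamp_unit <- function(vector) {
  # pmax raises values below 0 to 0; pmin lowers values above 1 to 1.
  pmin(pmax(vector, 0.0), 1.0)
}

clamp_unit(c(-0.3, 0.25, 1.7))  # 0.00 0.25 1.00
```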

Format Rule Parts

Description

This function formats the parts of an association rule into a string.

Usage

format_rule_parts(parts)

Arguments

`parts`	A list containing parts of an association rule.

Value

A formatted string representing the rule parts.

Map solution boundaries to time series instances.

Description

This function maps the lower and upper bounds of the solution vector to a subset of the dataset.

Usage

map_to_ts(lower, upper, instances)

Arguments

`lower`	The lower bound in [0, 1].
`upper`	The upper bound in [0, 1].
`instances`	The full dataset.

Value

A list with 'low', 'up', and 'filtered_instances'.
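
How two fractions in [0, 1] might select a contiguous block of rows can be sketched as below. The index formula and the name `slice_by_fraction` are assumptions made for illustration; only the names of the returned elements come from this manual.

```r
# Hypothetical mapping of [0, 1] bounds to 1-based row indices.
# Assumption: linear scaling over the rows; the package may map differently.
slice_by_fraction <- function(lower, upper, instances) {
  n <- nrow(instances)
  low <- floor(min(lower, upper) * (n - 1)) + 1    # tolerate swapped bounds
  up <- floor(max(lower, upper) * (n - 1)) + 1
  list(low = low, up = up,
       filtered_instances = instances[low:up, , drop = FALSE])
}

df <- data.frame(x = 1:9)
res <- slice_by_fraction(0.25, 0.5, df)
res$filtered_instances$x  # rows 3 to 5
```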

Implementation of Particle Swarm Optimization (PSO) metaheuristic algorithm.

Description

This function uses PSO, a stochastic population-based optimization algorithm, to search for numerical association rules with high fitness.

Usage

particle_swarm_optimization(
  d = 10,
  np = 10,
  w = 0.7,
  c1 = 1.5,
  c2 = 1.5,
  nfes = 1000,
  features,
  data,
  is_time_series = FALSE
)

Arguments

`d`	Dimension of the problem (default: 10).
`np`	Population size (default: 10).
`w`	Inertia weight (default: 0.7).
`c1`	Cognitive coefficient (default: 1.5).
`c2`	Social coefficient (default: 1.5).
`nfes`	The maximum number of function evaluations (default: 1000).
`features`	A list containing information about features, including type and bounds.
`data`	A data frame representing instances in the dataset.
`is_time_series`	A boolean indicating whether the dataset is a time series (default: FALSE).

Value

A list containing the best solution, its fitness value, the number of function evaluations, and the list of identified association rules.
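
The roles of `w`, `c1`, and `c2` are easiest to see in the velocity update. The following is a minimal, generic PSO sketch that maximizes a toy objective on [0, 1]^d. It is not the package's implementation: `pso_sketch`, the toy objective, and the names of the returned list elements are illustrative assumptions.

```r
# Generic global-best PSO sketch on the unit hypercube; maximizes `objective`.
# `pso_sketch` and its return names are illustrative, not package functions.
pso_sketch <- function(objective, d = 10, np = 10, w = 0.7, c1 = 1.5, c2 = 1.5,
                       nfes = 1000) {
  pos <- matrix(runif(np * d), nrow = np)          # particle positions
  vel <- matrix(0, nrow = np, ncol = d)            # particles start at rest
  pbest <- pos                                     # personal best positions
  pbest_fit <- apply(pos, 1, objective)            # costs np evaluations
  evals <- np
  g <- which.max(pbest_fit)
  gbest <- pbest[g, ]                              # global best position
  gbest_fit <- pbest_fit[g]
  while (evals < nfes) {
    for (i in seq_len(np)) {
      if (evals >= nfes) break
      vel[i, ] <- w * vel[i, ] +                         # inertia
        c1 * runif(d) * (pbest[i, ] - pos[i, ]) +        # cognitive pull
        c2 * runif(d) * (gbest - pos[i, ])               # social pull
      pos[i, ] <- pmin(pmax(pos[i, ] + vel[i, ], 0), 1)  # clamp to [0, 1]
      fit <- objective(pos[i, ])
      evals <- evals + 1
      if (fit > pbest_fit[i]) {
        pbest[i, ] <- pos[i, ]
        pbest_fit[i] <- fit
      }
      if (fit > gbest_fit) {
        gbest <- pos[i, ]
        gbest_fit <- fit
      }
    }
  }
  list(best_solution = gbest, best_fitness = gbest_fit, num_evaluations = evals)
}

set.seed(1)
# Toy objective with its maximum (0) at x = 0.5 in every dimension.
res <- pso_sketch(function(x) -sum((x - 0.5)^2), d = 5, np = 10, nfes = 2000)
res$best_fitness
```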

References

Kennedy, J., & Eberhart, R. (1995). "Particle swarm optimization." Proceedings of ICNN'95 - International Conference on Neural Networks, 4, 1942–1948. IEEE. doi:10.1109/ICNN.1995.488968

Print Numerical Association Rules

Description

This function prints association rules, including the antecedent, consequent, support, confidence, and fitness. For time series datasets, it reports the start and end timestamps instead of indices.

Usage

print_association_rules(rules, is_time_series = FALSE, timestamps = NULL)

Arguments

`rules`	A list containing association rules.
`is_time_series`	A boolean flag indicating if time series information should be included.
`timestamps`	A vector of timestamps corresponding to the time series data.

Value

Prints the association rules.

Print feature information extracted from a dataset.

Description

This function prints the information extracted about each feature.

Usage

print_feature_info(feature_info)

Arguments

`feature_info`	The list containing information about each feature.

Value

A message is printed to the console for each feature, providing information about the feature's type, and additional details such as lower and upper bounds for numerical features, or categories for categorical features. No explicit return value is generated.

Calculate the dimension of the problem, excluding timestamps.

Description

Calculate the dimension of the problem, excluding timestamps.

Usage

problem_dimension(feature_info, is_time_series = FALSE)

Arguments

`feature_info`	A list containing information about each feature.
`is_time_series`	Boolean indicating if time series data is present.

Value

The calculated dimension based on the feature types.

Read a CSV Dataset

Description

Reads a dataset from a CSV file and optionally parses a timestamp column.

Usage

read_dataset(
  dataset_path,
  timestamp_col = "timestamp",
  timestamp_formats = c("%d/%m/%Y %H:%M:%S", "%H:%M:%S %d/%m/%Y")
)

Arguments

`dataset_path`	A string specifying the path to the CSV file.
`timestamp_col`	A string specifying the timestamp column name (default: "timestamp").
`timestamp_formats`	A vector of date-time formats to try for parsing timestamps.

Value

A data frame containing the dataset.
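
Trying several timestamp formats in turn can be done in base R with `as.POSIXct()`, which returns `NA` for values that do not match the given format. The sketch below illustrates that approach under stated assumptions: `parse_timestamps` is a hypothetical helper, the UTC time zone is an arbitrary choice, and the package may handle parsing differently.

```r
# Hypothetical helper: return the first parse in which every value matched.
parse_timestamps <- function(x,
                             formats = c("%d/%m/%Y %H:%M:%S", "%H:%M:%S %d/%m/%Y")) {
  for (fmt in formats) {
    parsed <- as.POSIXct(x, format = fmt, tz = "UTC")  # NA where fmt does not fit
    if (!anyNA(parsed)) return(parsed)
  }
  stop("No timestamp format matched all values.")
}

# Inline CSV text stands in for a file on disk.
csv_text <- "timestamp,temperature\n12:30:00 01/02/2024,20.5\n13:30:00 01/02/2024,21.0"
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
df$timestamp <- parse_timestamps(df$timestamp)
str(df)
```

Here the first format fails on every row, so the second one is used and the values are read as 1 February 2024.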

Simple Random Search

Description

This function generates a vector of random solutions for a specified length.

Usage

rs(candidate_len)

Arguments

`candidate_len`	The length of the vector of random solutions.

Value

A vector of random solutions between 0 and 1.

Examples

candidate_len <- 10
random_solutions <- rs(candidate_len)
print(random_solutions)

Calculate support and confidence for an association rule.

Description

This function calculates the support and confidence for the given antecedent and consequent in the dataset instances.

Usage

supp_conf(antecedent, consequent, instances, features)

Arguments

`antecedent`	The antecedent part of the association rule.
`consequent`	The consequent part of the association rule.
`instances`	A data frame representing instances in the dataset.
`features`	A list containing information about features, including type and bounds.

Value

A list containing support and confidence values.
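
Under the standard definitions, support is the fraction of instances that satisfy both the antecedent and the consequent, and confidence is that count divided by the number of instances satisfying the antecedent. The sketch below computes both from logical masks on a toy data frame; `support_confidence`, the mask-based interface, and the returned names are illustrative assumptions, not the package's API.

```r
# Illustrative computation from logical row masks (hypothetical helper).
support_confidence <- function(antecedent_hit, consequent_hit) {
  n <- length(antecedent_hit)
  both <- sum(antecedent_hit & consequent_hit)     # rows matching the whole rule
  ante <- sum(antecedent_hit)                      # rows matching the antecedent
  list(
    supp = both / n,
    conf = if (ante > 0) both / ante else 0        # avoid division by zero
  )
}

df <- data.frame(
  temperature = c(18, 22, 25, 30),
  weather = c("rainy", "sunny", "sunny", "sunny")
)
ante <- df$temperature >= 20 & df$temperature <= 30   # temperature in [20, 30]
cons <- df$weather == "sunny"                          # weather = sunny
support_confidence(ante, cons)  # supp = 0.75, conf = 1
```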

Write Association Rules to CSV file

Description

This function writes association rules to a CSV file. For time series datasets, it writes the start and end timestamps instead of indices.

Usage

write_association_rules_to_csv(
  rules,
  file_path,
  is_time_series = FALSE,
  timestamps = NULL
)

Arguments

`rules`	A list of association rules.
`file_path`	The file path for the CSV output.
`is_time_series`	A boolean flag indicating if time series information should be included.
`timestamps`	A vector of timestamps corresponding to the time series data.

Value

No explicit return value. The function writes association rules to a CSV file.
