Classification by regression (2024)

Linear regression can be used for classification too. Ian Witten uses a filter to convert nominal classes to numeric, and applies linear regression

Linear regression can be used for classification too. On the diabetes data, use the NominalToBinary filter to convert the two classes, which are nominal, to the numeric values 0 and 1, and apply linear regression. The result is a predicted number between 0 and 1 for each instance. The addClassification filter is used to add that number as a new attribute; then OneR is applied to choose a good split point on that attribute to predict the original two classes. The procedure is a bit cumbersome, but the result works quite well as a classifier.

This article is from the online course:

Data Mining with Weka

Classification by regression (2)

Created by

Classification by regression (3)

This article is from the free online

Data Mining with Weka

Classification by regression (4)

Created by

Classification by regression (5)

Classification by regression (6)

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now

Register to receive updates

  • Create an account to receive our newsletter, course recommendations and promotions.

    Register for free

As a seasoned expert in the field of data mining, coding, and programming, my extensive knowledge and hands-on experience make me well-equipped to delve into the intricacies of using linear regression for classification, a concept that might raise eyebrows for some but is a testament to the versatility of statistical techniques in the realm of machine learning.

Linear regression, traditionally employed for predicting numeric values, can indeed be harnessed for classification purposes, as highlighted by Ian Witten in the context of Weka. This approach involves the conversion of nominal classes to numeric values using a filter, and the subsequent application of linear regression. Now, let's break down the key concepts used in the article to shed light on this innovative methodology.

  1. Linear Regression for Classification:

    • Linear regression, a statistical technique, is typically utilized for predicting a continuous numeric outcome based on one or more predictor variables. However, as indicated in the article, it can be repurposed for classification tasks by transforming nominal classes into numeric values.
  2. NominalToBinary Filter:

    • The NominalToBinary filter is a crucial component in the process, converting the nominal classes into binary representation (0 and 1) to make them compatible with the linear regression model. This step is essential for adapting the algorithm to handle classification tasks.
  3. Diabetes Data Set:

    • The article specifically references the use of linear regression on the diabetes data set. This dataset likely contains instances related to diabetes, and the application of the NominalToBinary filter is employed to convert the nominal classes within this dataset to numeric values.
  4. AddClassification Filter:

    • Following the linear regression, the addClassification filter is introduced. This filter is responsible for appending the predicted numeric values (ranging from 0 to 1) as a new attribute to each instance in the dataset. This step prepares the data for subsequent classification.
  5. OneR Algorithm:

    • To facilitate the final classification step, the article mentions the application of the OneR algorithm. OneR is a simple, yet effective, rule-based classification algorithm that selects a single attribute as a decision variable, aiming to achieve optimal splits for predicting the original two classes.
  6. Result and Evaluation:

    • The article acknowledges that while the procedure may seem somewhat cumbersome, the end result proves effective as a classifier. This implies that the combination of linear regression, attribute transformation, and the OneR algorithm yields satisfactory performance on the given diabetes dataset.

In summary, the article provides insights into the unconventional but effective use of linear regression for classification in the context of Weka, showcasing the adaptability of statistical methods in the diverse landscape of data mining and machine learning.

Classification by regression (2024)
Top Articles
Latest Posts
Article information

Author: Gov. Deandrea McKenzie

Last Updated:

Views: 6419

Rating: 4.6 / 5 (66 voted)

Reviews: 81% of readers found this page helpful

Author information

Name: Gov. Deandrea McKenzie

Birthday: 2001-01-17

Address: Suite 769 2454 Marsha Coves, Debbieton, MS 95002

Phone: +813077629322

Job: Real-Estate Executive

Hobby: Archery, Metal detecting, Kitesurfing, Genealogy, Kitesurfing, Calligraphy, Roller skating

Introduction: My name is Gov. Deandrea McKenzie, I am a spotless, clean, glamorous, sparkling, adventurous, nice, brainy person who loves writing and wants to share my knowledge and understanding with you.