R or Weka Lab

  
Laboratory I:
         
To download additional .arff data sets go to:
http://www.hakank.org/weka/
or search the Internet for the required .arff files.
· What’s the difference between a “training set” and a “test set”?
· Why might a pruned decision tree that doesn’t fit the data so well be better than an un-pruned one?
· What’s the first thing that 1R does when making a rule based on a numeric attribute?
· How does 1R avoid overfitting when making a rule based on an enumerated and/or numeric attribute?
· What is the difference between Attribute, Instance and Training set? 
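As a starting point for the 1R questions above, here is a minimal sketch of the 1R idea for nominal attributes only. It is not Weka's OneR implementation: the function name `one_r` and the toy data are hypothetical, and numeric attributes (which need an extra discretization step) are deliberately left out.

```python
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (best_attribute, rule, training_errors) for nominal data."""
    best = None
    for attr in attributes:
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors: instances outside the majority class of their value.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        # Keep the attribute whose rule makes the fewest training errors.
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical toy data, not taken from any .arff file used in this lab.
rows = [
    {"sky": "sunny", "wind": "weak", "play": "no"},
    {"sky": "sunny", "wind": "strong", "play": "no"},
    {"sky": "cloudy", "wind": "weak", "play": "yes"},
    {"sky": "cloudy", "wind": "strong", "play": "yes"},
    {"sky": "rainy", "wind": "weak", "play": "yes"},
    {"sky": "rainy", "wind": "strong", "play": "no"},
]

attr, rule, errors = one_r(rows, ["sky", "wind"], "play")
print(attr, rule, errors)
```

The error count here is measured on the same rows that built the rule, so it is a training-set score; judging the rule fairly would require instances held back as a test set.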

What is the difference between ID3 and C4.5?

Use the following learning schemes to analyze the iris data (in iris.arff):

  
OneR

– weka.classifiers.OneR
 
Decision table

– weka.classifiers.DecisionTable -R
 
C4.5

– weka.classifiers.j48.J48
· Do the decisions made by the classifiers make sense to you? Why?
· What can you say about the accuracy of these classifiers when they classify iris instances that were not used for training?
· How did each one of the methods perform?

Use the following learning schemes to analyze the bolts data (bolts.arff without the TIME attribute):

  
Decision Tree

– weka.classifiers.j48.J48
 
Decision table

– weka.classifiers.DecisionTable -R
 
Linear regression

– weka.classifiers.LinearRegression
 
M5′ 

– weka.classifiers.M5′
· The dataset describes the time needed by a machine to produce and count 20 bolts. (More details can be found in the file containing the dataset.) 
· Analyze the data. What adjustments have the greatest effect on the time to count 20 bolts? 
· According to each classifier, how would you adjust the machine to get the shortest time to count 20 bolts?

Produce a model for both the Weather and Weather.nominal data sets. Which method(s) did you use? What did the tree(s) look like?

Laboratory II:
 
To download additional .arff data sets go to:
the Weka data folder for BreastTumor.arff
http://www.hakank.org/weka/ for zoo.arff, wine.arff, bodyfat.arff, sleep.arff, pollution.arff

Use the following learning schemes to analyze the zoo data (in zoo.arff):

  
OneR

– weka.classifiers.OneR
 
Decision table

– weka.classifiers.DecisionTable -R
 
C4.5

– weka.classifiers.j48.J48
 
K-means

– weka.clusterers.SimpleKMeans
Try using reduced-error pruning for C4.5. Did it change the produced model? Why?
For K-means, set k=10 for the first run and adjust as needed. What was the final value of k? Why?

Use the following learning schemes to analyze the breast tumor data.

  
Linear regression

– weka.classifiers.LinearRegression
 
M5′ 

– weka.classifiers.M5′
 
Regression Tree

– weka.classifiers.M5′
 
K-means clustering

– weka.clusterers.SimpleKMeans
A) How many leaves did the model tree produce? How many did the regression tree produce? What happens if you change the pruning factor?
How many clusters did you choose for the K-means method? Was that a good choice? Did you try a different value for k?
B) Now perform the same analysis on the bodyfat.arff data set.
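The k-means questions in this lab ask how the choice of k and of the random seed affects the result. The sketch below is a plain k-means (Lloyd's algorithm) written for illustration only; it is not Weka's SimpleKMeans, and the helper names and the two-dimensional toy points are hypothetical. It prints the within-cluster sum of squared errors for several values of k and several seeds.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def k_means(points, k, seed, iterations=100):
    """Return (centroids, within-cluster sum of squared errors)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # the seed decides the starting centroids
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse

# Hypothetical toy points forming three loose groups (not the lab data).
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

for k in (1, 2, 3, 4):
    for seed in (1, 2, 3):
        _, sse = k_means(points, k, seed)
        print(f"k={k} seed={seed} SSE={sse:.3f}")
```

Because the error keeps shrinking as k grows, a low error alone does not show that a given k is a good choice; comparing runs across seeds shows how much the starting centroids matter.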

Use a k-means clustering technique to analyze the iris data set. What did you set the k value to be? Try several different values. What was the random seed value? Experiment with different random seed values. How did changing these values influence the produced models?
Produce a hierarchical clustering (COBWEB) model for the iris data. How many clusters did it produce? Why? Does it make sense? What did you expect?

Change the acuity and cutoff parameters in order to produce a model similar to the one obtained in the book. Use the classes to clusters evaluation. What does that tell you?

Laboratory III:
 
To download additional .arff data sets go to:
http://www.hakank.org/weka/
zoo.arff, wine.arff, soybean.arff, zoo2_x.arff, sunburn.arff, disease.arff
8. Use the following learning schemes to compare the training set and 10-fold stratified cross-validation scores of the disease data (in disease.arff): 
  
Decision table

– weka.classifiers.DecisionTable -R
 
C4.5

– weka.classifiers.j48.J48
 
Id3

– weka.classifiers.Id3
A) What does the training set evaluation score tell you? 
B) What does the cross-validation score evaluate? 
C) Which one of these models would you say is the best? Why?
9. Use the following learning schemes to analyze the wine data (in wine.arff). 
  
C4.5

– weka.classifiers.j48.J48
 
Decision List

– weka.classifiers.PART
A) What is the most important descriptor (attribute) in wine.arff?
B) How well were these two schemes able to learn the patterns in the dataset? How would you quantify your answer?
C) Compare the training set and 10-fold cross-validation scores of the two schemes.
D) Would you trust these two models? Did they really learn what is important for proper classification of wine?
E) Which one would you trust more, even if just very slightly?
10. Perform the same analysis of sunburn.arff as in 9. Instead of 10-fold cross-validation use 5-fold.
A)-E) Same as in 9.
F) Why could we not use 10-fold cross-validation in this example?
11. Choose one of the following three files: soybean.arff, zoo.arff or zoo2_x.arff and use any two schemes of your choice to build and compare the models.
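
Several questions in this laboratory compare training set scores with stratified cross-validation scores. The sketch below shows one simple way to build stratified folds; it is an illustration under stated assumptions, not the procedure Weka uses internally. The function names and the eight hypothetical class labels are made up for the example and are not taken from any of the lab's data sets.

```python
from collections import defaultdict

def stratified_folds(labels, n_folds):
    """Assign instance indices to folds so class proportions stay similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for label in sorted(by_class):
        # Deal the indices of each class round-robin across the folds.
        for index in by_class[label]:
            folds[position % n_folds].append(index)
            position += 1
    return folds

def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical labels for a very small data set: five "a" and three "b".
labels = ["a"] * 5 + ["b"] * 3

folds5 = stratified_folds(labels, 5)
folds10 = stratified_folds(labels, 10)
print("5-fold sizes: ", [len(f) for f in folds5])
print("10-fold sizes:", [len(f) for f in folds10])

for train, test in train_test_splits(folds5):
    # Every instance is tested exactly once and never trained on in that round.
    print("train:", sorted(train), "test:", sorted(test))
```

With fewer instances than folds, some folds stay empty, so there is nothing to test on in those rounds. A training set score, by contrast, reuses every instance for both building and testing the model.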
