250 words: agree or disagree with each question
RapidMiner is a popular analytics package used for data mining (Boileve, 2019). There is much to like about the software, but one feature that stands out is how easy it is to move from design to production (Boileve, 2019). In contrast, one of its weaker points is integration with third-party applications (Boileve, 2019). R is a programming language commonly used for statistical modeling (Pros and Cons of R, 2019). One of the best things about R is the vast number of packages available for it (Pros and Cons of R, 2019). One of its drawbacks is the lack of competent security features (Pros and Cons of R, 2019). KNIME is a great product for beginners because projects require little to no coding, and Excel knowledge is enough (Cui, 2020). The downside of KNIME is that the program runs slowly (Cui, 2020).
Cloud-based analytics has its ups and downs, as does almost everything. A business would be prudent to evaluate its needs before adopting it. The advantages include low server costs, automatic clustering that distributes data across the available nodes, limited downtime, faster real-time analysis, and lower up-front costs because expensive hardware is not needed (Delgado, 2013). The disadvantages include outages that can last from hours to days, the large amount of bandwidth needed to transfer data, unknown risks because cloud analytics is still a relatively new process, and the potential for costs to rise as companies use more and more server space (Delgado, 2013).
Using R for this data mining course is a good option because of the wide variety of packages available to the user (What is R?, n.d.). It is a free, open-source programming language, so no license is needed. R is also versatile and can be paired with many other programming languages, such as Python (What is R?, n.d.). The most important reason to use R in this course is that it serves as a bridge language connecting different languages in the data mining community (What is R?, n.d.).
Boileve, D. (2019, February 4). Very user-friendly ETL with mining capabilities. Will save you a lot of time with your data. https://www.trustradius.com/reviews/rapidminer-studio-2019-02-04-04-24-32
Cui, I. (2020, July 1). KNIME Review from a daily user. https://www.trustradius.com/reviews/knime-analytics-platform-2020-06-26-14-54-29
Delgado, R. (2013, December 20). The pros and cons of harnessing big data in the cloud. ITProPortal. https://www.itproportal.com/2013/12/20/pros-and-cons-of-big-data-in-the-cloud/
Pros and Cons of R Programming Language – Unveil the Essential Aspects! DataFlair. (2019, December 31). https://data-flair.training/blogs/pros-and-cons-of-r-programming-language/
What is R? R. (n.d.). https://www.r-project.org/about.html
Data mining is a process that goes beyond counting, reporting, and descriptive techniques (Shmueli et al., 2018). It involves finding anomalies, patterns, and correlations within large data sets in order to predict outcomes. To accomplish these daunting tasks, analysts must employ computer software. There are many options in the industry; for the purpose of this discussion, however, I will concentrate on three of them: R, Python, and SAS.
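To make "finding anomalies and correlations" a little more concrete, here is a minimal sketch in Python using only the standard library. The numbers, the variable names (`ad_spend`, `sales`, `daily_orders`), and the z-score threshold of 2.0 are all made up for illustration; a real project would rely on a dedicated package in R, Python, or SAS rather than hand-written functions like these.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical data, invented purely for this example.
ad_spend = [10, 12, 15, 18, 20, 25]
sales = [101, 119, 152, 178, 205, 248]
daily_orders = [52, 49, 51, 50, 48, 53, 120, 50]

print("correlation between ad spend and sales:", round(pearson(ad_spend, sales), 3))
print("unusual order counts:", zscore_outliers(daily_orders))
```

A correlation close to 1 suggests that the two series move together, and the z-score check flags the one day whose order count is far from the rest. Both are only starting points for the kind of predictive work described above.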
R is one of the leading analytics tools in the industry. It is mainly used for statistics and modeling, and it has both advantages and disadvantages. The advantages include compatibility, a wide array of packages, and platform independence (Pros and Cons of R Programming Language-Unveil the Essential Aspects!, 2021). R is compatible with other programming languages such as C, C++, Java, and Python. Additionally, its library includes over 10,000 packages and keeps growing, so there is a wide array of functions that let the analyst accomplish almost any task. Lastly, R runs on any major platform (Windows, Linux, and Mac).
The disadvantages include weak security, a steep learning curve, and slow speed (Pros and Cons of R Programming Language-Unveil the Essential Aspects!, 2021). Firstly, R lacks the basic security features found in most programming languages, such as Python, which rules it out for web applications. Secondly, the complexity of the language makes it hard to learn, especially for those without programming experience. Lastly, R is much slower than languages such as MATLAB and Python.
Python is an immensely popular programming language in data mining. Its popularity comes from how easy it is to read, write, and maintain. It is good at machine learning and can be used with platforms such as SQL Server, a MongoDB database, or JSON (Top 10 Data Analytics Tools, 2021). Nonetheless, Python also has some disadvantages, including slow speed, high memory consumption, poor compatibility with mobile technology, and threading limitations (Basel, 2018).
SAS is another great data mining tool. It is widely used in customer intelligence, enabling analysts to profile existing and potential clients. It can predict behaviors and manage and optimize communications (Top 10 Data Analytics Tools, 2021). However, it also has certain limitations, including its price and a limited library compared with free software such as R.
With the continued growth of Big Data, and consequently of data mining tools and techniques, there must be a bridge to integrate all this information. This link is known as cloud-based analytics. It is not a new concept but one that has accompanied the creation, implementation, and integration of cloud technology.
Cloud-based analytics has both advantages and disadvantages. Among its great advantages are the amount of information that can be harvested and the speed at which it can be gathered. Other advantages are real-time analysis and lower up-front costs. Studies have shown that organizations can conduct more targeted and personalized analysis of their customers while paying a fraction of what they would have spent had they invested in all the hardware infrastructure themselves. Nonetheless, this technology also has its downsides. The disadvantages include a lack of industry best practices, the costs of data migration and integration, and outages (Delgado, 2013).
In conclusion, deciding which data mining program is best boils down to what the analyst is trying to accomplish, as well as their skills, budget, and hardware. Since R is one of the leaders in the industry, I believe that a sound foundation in this language will be extremely beneficial. However, as a future analyst, I should not treat R as my only ‘go-to’ program and should keep adding more languages to my tool bag as time goes on.
Basel, K. (2018, June 20). Python Pros and Cons. Retrieved from Net Guru: https://www.netguru.com/blog/python-pros-and-cons
Delgado, R. (2013, December 20). The Pros and Cons of Harnessing Big Data in the Cloud. Retrieved from IT Pro Portal: https://www.itproportal.com/2013/12/20/pros-and-cons-of-big-data-in-the-cloud/
Pros and Cons of R Programming Language-Unveil the Essential Aspects! (2021). Retrieved from Data Flair: https://data-flair.training/blogs/pros-and-cons-of-r-programming-language/
Shmueli, G., Bruce, P. C., Yahav, I., Patel, N. R., & Lichtendahl, K. C., Jr. (2018). What is Data Mining? In G. Shmueli, P. C. Bruce, I. Yahav, N. R. Patel, & K. C. Lichtendahl Jr., Data Mining for Business Analytics (p. 59). Hoboken: John Wiley & Sons.
Top 10 Data Analytics Tools. (2021). Retrieved from Proschool: https://www.proschoolonline.com/blog/top-10-data-analytics-tools