Data Science & Big Data Research Paper – PhD (No Plagiarism)

We have discussed the PPT with the professor, and he made the points below.

1. We need to compare our PPT with a similar scholarly paper and explain why our PPT is unique compared to that paper (add a slide covering a topic that is unique or represents additional research compared with the similar scholarly paper).


2. Take a scenario and explain at least 4 tools as examples, showing how each will be used in the scenario.

3. We need a research paper on the same topic as the above PPT, in the template format below (just elaborate a bit on the same details).

Research Paper Format

– Minimum 12 pages with APA format

– Cover Page

– Abstract

– Table of Contents

– Discussion – Main Content

– Justification and explanation

– Conclusions

– Citations/ References

Note: As discussed, please provide a new PPT as per the requirements.

Running Head: BIG DATA PROCESSING OF SOFTWARE AND TOOLS

University of the Cumberlands

Big Data Processing of Software and Tools

Data Science & Big Data Analytics

ITS 836-21 Group-1

Prof: Gamini Bulumulle

Date submitted: 02/23/2020

Submitted By:

Table of contents

Abstract

Executive summary

Big data analytics software

· Apache Hadoop

· CDH

· Cassandra

· KNIME

· Datawrapper

· MongoDB

· Lumify

· HPCC

· Storm

· Apache SAMOA

· Talend

· RapidMiner

Analyzing the data sets using R language

Conclusion

References

Abstract

Big data analytics has been in use for years, and most companies have embraced it to harness the data generated in their day-to-day operations. Companies that apply analytics can benefit considerably. As early as the 1950s, businesses practiced a basic form of analytics through spreadsheet analysis, a crude approach that revealed only small pieces of information and simple patterns. Today, companies use big data analytics software to handle very large volumes of data because of its benefits to the business, including speed in handling data, efficiency, and productivity. Many businesses prefer to accumulate large amounts of data and run analytics on it later for future reference. Big data analytics helps businesses make sound decisions about how they handle and use data. The ability to work faster while remaining efficient gives companies an advantage they did not previously have. This research paper focuses on big data analytics software and its benefits to an organization.

Keywords: Big data, analysis, spreadsheet, efficiency, organization

Executive summary

Big data analysis software gives organizations the ability to generate new ideas based on the results of the analysis. This in turn encourages more effective and efficient business decisions, increased benefits, increased proficiency, and satisfied clients. In research by Tom Davenport, more than fifty companies were analysed to see how they used big data analysis software (Chandarana & Vijayalakshmi, 2014). The research concluded that these companies saw decreased data analysis costs. Companies that used big data analytics software such as Apache Hadoop together with cloud-based analytics had reduced costs for storing and analysing data, and they also had an upper hand in making business decisions. The research also found that companies using big data analysis software were quicker and more flexible when analysing data. With in-memory analytics and Hadoop, combined with the ability to analyse new sources of data, companies can analyse data at considerable speed and reach conclusions based on the results.

With big data analytics software, organizations are better able to gauge what their customers need. Davenport's research emphasizes that big data analytics brings an increased understanding of clients' needs and better ways to address them. Today, many organizations use big data analytics to differentiate themselves in the market. With open source big data analytics software, the most valuable parts of the organization are kept secure and expenses are reduced. Hadoop is one of the most widely used big data analytics platforms among businesses, and many vendors currently make use of it.

Hypothetically, a company may need to carry out a market analysis in order to ascertain the trends in its market. This scenario calls for big data tools to support the trend analysis. Big data software such as Hadoop, Apache SAMOA, Cassandra, and Datawrapper can be used to analyse the data and build a picture of what the market looks like. Each of these tools plays a role in market trend analysis. For example, Hadoop can be used to analyse huge data sets and produce information about future trends in that line of business. Cassandra can store the large volumes of market data across multiple servers with high availability, and Apache SAMOA can mine incoming data streams as they arrive. Datawrapper can help the organization visualize the information being analysed for market trends.

Big data analytics software

Several questions arise when big data analytics is put to use: which analysis software to use, how large the data sets are, how much data the organization normally produces, and so on (Bhosale & Gadekar, 2014). Big data tools can be broadly classified into development platforms, advanced tools, analysis instruments, and other data analytics tools. Some of the software used for big data analytics is described below.

Apache Hadoop

This software is used in big data analytics to process huge volumes of data across clustered file systems. Hadoop handles big data by means of the MapReduce programming model. It is an open source framework written in Java that provides cross-platform support, and it is one of the most widely used analytics platforms. Reportedly, more than fifty Fortune companies use Hadoop in their data analysis systems. Noteworthy companies that use Hadoop include Facebook, Intel, Amazon Web Services, Hortonworks, IBM, Microsoft, and many more.

There are many benefits that come with using Hadoop: its distributed file system (HDFS) can hold all kinds of data, such as pictures, XML, and JSON; it is very valuable for R&D purposes; it provides quick access to data; and it is highly scalable and remains available because it runs on a cluster of computers. However, Hadoop also has disadvantages, including the storage overhead caused by data redundancy and reduced performance in I/O operations.
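The MapReduce programming model mentioned above can be illustrated with a minimal sketch. The Python code below is not Hadoop code and does not use any Hadoop API; it only imitates the map, shuffle, and reduce phases in a single process for a word count. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data tools", "big data analytics"]  # hypothetical input
    print(word_count(sample))
```

In a real Hadoop cluster the map and reduce steps would run in parallel on different machines and the shuffle would move data across the network; the sketch only shows how the three phases fit together.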

CDH (Cloudera Distribution for Hadoop) software

CDH targets enterprise-class deployments of Hadoop technology. It is completely open source and offers a free platform distribution that includes Apache Hadoop, Apache Spark, Apache Impala, and more. It lets you collect, process, manage, discover, model, and distribute large volumes of data. Benefits of using CDH: a comprehensive distribution, Cloudera Manager administers the Hadoop cluster very well, easy implementation, less complex administration, and high security and governance. Disadvantages of using CDH: a few confusing UI features, such as the charts in the Cloudera Manager (CM) service; multiple recommended approaches to installation, which can be confusing; and per-node licensing that is quite expensive.

Cassandra

Apache Cassandra is a free, open-source, distributed NoSQL DBMS built to manage huge volumes of data spread across many commodity servers while delivering high availability. It uses CQL (Cassandra Query Language) to interact with the database. Prominent organizations using Cassandra include Accenture, American Express, Facebook, General Electric, Honeywell, and Yahoo. Benefits of using Apache Cassandra: no single point of failure, fast handling of massive data, log-structured storage, automated replication, linear scalability, and a simple ring architecture. Disadvantages of using Cassandra: it requires some extra effort in troubleshooting and maintenance, clustering could be improved, and there is no row-level locking feature.
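Cassandra arranges its nodes in a ring and replicates each piece of data to several nodes, which is how it avoids a single point of failure. The sketch below is a toy model of that idea in Python, not Cassandra's actual implementation: it assumes one token per node and uses MD5 as an arbitrary hash, and the `Ring` class, node names, and key are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Hash a string to an integer position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: each node owns one token; a key goes to the next nodes clockwise."""

    def __init__(self, nodes, replication_factor=2):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication_factor must be between 1 and the node count")
        self.replication_factor = replication_factor
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key):
        """Return the nodes that would hold copies of the key."""
        # First node whose token is greater than the key's token, wrapping around.
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.ring)
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"], replication_factor=2)
    print(ring.replicas("customer:42"))  # hypothetical key
```

Because every key maps to more than one node, losing a single node in this model still leaves a copy of the data reachable on another node.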

KNIME

KNIME stands for Konstanz Information Miner, an open-source tool used for enterprise reporting, integration, analytics, CRM, data mining, data analysis, text mining, and business intelligence. It supports Linux, OS X, and Windows operating systems. It can be considered a good alternative to SAS. Top companies using KNIME include Comcast, Johnson & Johnson, and Canadian Tire. Benefits of using KNIME: simple ETL operations, very good integration with other technologies and languages, a rich algorithm set, highly usable and well-organized workflows, automation of a great deal of manual work, no stability issues, and easy setup. Disadvantages of using KNIME for data analytics: data handling capacity could be improved, it occupies nearly all available RAM, and it could have supported integration with graph databases.

Datawrapper

Datawrapper is an open-source platform for data visualization that helps its users generate simple, precise, and embeddable charts quickly. Its major customers are newsrooms spread all over the world, including The Times, Fortune, Mother Jones, Bloomberg, and Twitter. Benefits of using Datawrapper for big data analytics: it is device friendly and works well on all kinds of devices (mobile, tablet, or desktop); it is fully responsive, fast, and interactive; it brings all charts together in one place; it offers great customization and export options; and it requires zero coding. Disadvantage: limited color palettes.

MongoDB

MongoDB is a NoSQL, document-oriented database written in C, C++, and JavaScript. It is free to use and is an open-source tool that supports multiple operating systems, including Windows Vista (and later versions), OS X (10.7 and later versions), Linux, Solaris, and FreeBSD. Its main features include aggregation, ad hoc queries, use of the BSON format, sharding, indexing, replication, server-side execution of JavaScript, a schemaless design, capped collections, MongoDB Management Service (MMS), load balancing, and file storage. Major customers using MongoDB include Facebook, eBay, MetLife, and Google. Benefits of using MongoDB for big data analytics: it is easy to learn, supports multiple technologies and platforms, installs and runs without hiccups, and is reliable and low cost. Disadvantages of using MongoDB for big data analytics: limited analytics and slow performance for certain use cases.

Lumify

Lumify is a free and open-source tool for big data fusion/integration, analytics, and visualization. Its primary features include full-text search, 2D and 3D graph visualizations, automatic layouts, link analysis between graph entities, integration with mapping systems, geospatial analysis, multimedia analysis, and real-time collaboration through a set of projects or workspaces. Benefits of using Lumify for big data analytics: it is scalable and secure, it is supported by a dedicated full-time development team, it supports cloud-based environments, and it works well with Amazon's AWS.

HPCC

HPCC stands for High-Performance Computing Cluster. It is a complete big data solution built on a highly scalable supercomputing platform. HPCC is also referred to as DAS (Data Analytics Supercomputer). The tool was developed by LexisNexis Risk Solutions. It is written in C++ and in a data-centric programming language known as ECL (Enterprise Control Language). It is based on the Thor architecture, which supports data parallelism, pipeline parallelism, and system parallelism. It is an open-source tool and a good substitute for Hadoop and some other big data platforms. Benefits of using HPCC for big data analytics: the architecture is based on commodity computing clusters that provide high performance; it processes data in parallel; it is fast, powerful, and highly scalable; it supports high-performance online query applications; and it is cost-effective and comprehensive.

Storm

Apache Storm is a cross-platform, distributed, fault-tolerant framework for real-time stream processing. It is free and open source. Storm's developers include BackType and Twitter. It is written in Clojure and Java. Its architecture is based on customized spouts and bolts, which describe sources of information and the manipulations applied to them, allowing batch and distributed processing of unbounded streams of data. Groupon, Yahoo, Alibaba, and The Weather Channel are among the well-known organizations that use Apache Storm. Benefits of using Apache Storm for big data analytics: it is reliable at scale, very fast, and fault tolerant; it guarantees the processing of data; and it has many use cases, including real-time analytics, log processing, ETL (Extract-Transform-Load), continuous computation, distributed RPC, and machine learning. Disadvantages of using Apache Storm for big data analytics: it is difficult to learn and use, debugging is difficult, and the native scheduler and Nimbus can become bottlenecks.
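Storm topologies are built from spouts (sources of data) and bolts (processing steps). The sketch below imitates that structure with plain Python generators in a single process; it does not use Storm's API, and the function names (`sentence_spout`, `split_bolt`, `count_bolt`, `run_topology`) and sample sentences are assumptions made for illustration.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: a source that emits one tuple (here, a sentence) at a time."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream):
    """Bolt: transform each incoming sentence into a stream of words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word


def count_bolt(stream):
    """Bolt: keep running counts and emit the updated count after every word."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


def run_topology(sentences):
    """Wire spout -> split bolt -> count bolt and collect the latest counts."""
    final_counts = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        final_counts[word] = count
    return final_counts


if __name__ == "__main__":
    print(run_topology(["storm processes streams", "storm is fast"]))
```

Each stage handles one item at a time and never needs the whole input in memory, which is the property that lets this style of pipeline work on unbounded streams.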

Apache SAMOA

SAMOA stands for Scalable Advanced Massive Online Analysis. It is an open-source platform for big data stream mining and machine learning. It lets you create distributed streaming machine learning (ML) algorithms and run them on multiple DSPEs (distributed stream processing engines). Apache SAMOA's closest alternative is the BigML tool. Benefits of using Apache SAMOA for big data analytics: it is simple and fun to use, fast and scalable, offers true real-time streaming, and has a Write Once Run Anywhere (WORA) architecture.
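To make the idea of streaming machine learning concrete, the sketch below shows a learner that updates itself one example at a time and is evaluated in a test-then-train fashion. It is plain Python and does not use SAMOA's API; the `OnlinePerceptron` class, the `prequential_accuracy` helper, and the tiny data stream are assumptions made for illustration.

```python
class OnlinePerceptron:
    """Binary classifier that learns from one example at a time."""

    def __init__(self, n_features, learning_rate=1.0):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        """Return 1 if the weighted sum is positive, otherwise 0."""
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if score > 0 else 0

    def learn_one(self, x, y):
        """Predict first, then update the model only if the prediction was wrong."""
        prediction = self.predict(x)
        error = y - prediction
        if error != 0:
            step = self.learning_rate * error
            self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
            self.bias += step
        return prediction == y


def prequential_accuracy(model, stream):
    """Test-then-train evaluation: score each example before learning from it."""
    results = [model.learn_one(x, y) for x, y in stream]
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    # Hypothetical stream of (features, label) pairs.
    stream = [([1.0, 1.0], 1), ([0.0, 0.0], 0), ([1.0, 1.0], 1), ([0.0, 0.0], 0)]
    print(prequential_accuracy(OnlinePerceptron(2), stream))
```

The model never stores past examples, so memory use stays constant no matter how long the stream runs; distributing such a learner across processing engines is the part a platform like SAMOA is meant to handle.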

Talend

Talend's big data integration products include: (1) Open Studio for Big Data, which comes under a free and open-source license; its components and connectors are Hadoop and NoSQL, and it provides community support only. (2) Big Data Platform, which comes with a user-based subscription license; its components and connectors are MapReduce and Spark, and it provides web, email, and phone support. (3) Real-Time Big Data Platform, which also comes under a user-based subscription license; its components and connectors include Spark streaming, machine learning, and IoT, and it provides web, email, and phone support. Benefits of using Talend for big data analytics: it streamlines ETL and ELT for big data, achieves the speed and scale of Spark, accelerates the move to real time, handles multiple data sources, and provides numerous connectors under one roof, which in turn lets you customize the solution according to your needs. Disadvantages of using Talend for big data analytics: community support could be better, the interface could be improved and made easier to use, and it is difficult to add a custom component to the palette.

RapidMiner

RapidMiner is a cross-platform tool that offers an integrated environment for data science, machine learning, and predictive analytics. It comes under various licenses that offer small, medium, and large proprietary editions as well as a free edition that allows 1 logical processor and up to 10,000 data rows. Organizations such as Hitachi, BMW, Samsung, and Airbus have been using RapidMiner. Benefits of using RapidMiner for big data analytics: an open-source Java core, the convenience of front-line data science tools and algorithms, a code-optional GUI, good integration with APIs and the cloud, and excellent customer service and technical support. However, RapidMiner's online data services should be improved.

Analyzing the data sets using R language

Data simulation is a crucial stage in processing raw data: it helps identify and trace patterns and generate reports that enhance productivity. We took a sample data set from a computer store and ran simulations to show the different types of RAM available in the store and the prices of hard disks.
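The paper's analysis was carried out in R, and its data set and scripts are not reproduced here. Purely as an illustration of the kind of simulation described, the Python sketch below generates made-up computer store records and summarizes RAM types and hard disk prices. Every value, category, price range, and function name in it is an assumption, not the authors' data or method.

```python
import random
import statistics
from collections import Counter


def simulate_store(n_items, seed=0):
    """Generate hypothetical inventory records; all values are made up."""
    rng = random.Random(seed)
    ram_types = ["4 GB", "8 GB", "16 GB"]  # assumed RAM categories
    records = []
    for _ in range(n_items):
        records.append({
            "ram": rng.choice(ram_types),
            # Assumed price range for hard disks, in arbitrary currency units.
            "hard_disk_price": round(rng.uniform(40.0, 200.0), 2),
        })
    return records


def summarize(records):
    """Count items per RAM type and compute the mean hard disk price."""
    ram_counts = Counter(record["ram"] for record in records)
    mean_price = statistics.mean(record["hard_disk_price"] for record in records)
    return ram_counts, mean_price


if __name__ == "__main__":
    counts, mean_price = summarize(simulate_store(100, seed=1))
    print(counts)
    print(mean_price)
```

Fixing the seed makes the simulated data reproducible, so the same summary can be regenerated when the report is revised.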

Conclusion

The digital age has made it easier for professionals to access the information that allows them to improve business performance (Manikandan & Ravi, 2014). However, to make use of this information, an organization needs data analytics software that provides tools for data mining, organization, analysis, and visualization. The software should also be equipped with machine learning and advanced algorithms to turn raw data into meaningful insights quickly. In this way, a business can keep up with trends and find ways to further improve its overall operations. Still, many factors are involved in finding the right analytics tool for a particular business. From examining its performance to determining how well it works with other systems, the research process can be overwhelming. To help with this, we have gathered the leading products on the market and reviewed their functionality and usability. Big data tools help us store huge volumes of data and transform them into analytics that track, explain, and predict patterns and improve the productivity of the organization. With this overview, it should be easier to determine the most suitable data analytics platform for a given set of operations.

References

Bhosale, H. S., & Gadekar, D. P. (2014). A review paper on big data and hadoop. International Journal of Scientific and Research Publications, 4(10), 1-7.

Chandarana, P., & Vijayalakshmi, M. (2014, April). Big data analytics frameworks. In 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA) (pp. 430-434). IEEE.

Manikandan, S. G., & Ravi, S. (2014, October). Big data analysis using Apache Hadoop. In 2014 International Conference on IT Convergence and Security (ICITCS) (pp. 1-4). IEEE.

Talia, D. (2013). Clouds for scalable big data analytics. Computer, (5), 98-101.

Allen, G., Campbell, F., & Hu, Y. (2015). Comments on “visualizing statistical models”: Visualizing modern statistical methods for Big Data. Statistical Analysis and Data Mining: The ASA Data Science Journal, 8(4), 226-228. doi: 10.1002/sam.11272

Griffith, D. (1993). Advanced spatial statistics for analysing and visualizing geo-referenced data. International Journal of Geographical Information Systems, 7(2), 107-123. doi: 10.1080/02693799308901945
