Assignment Brief
Write a 450-word essay on a data science blog post that covers a topic you want to learn.
The internet has countless articles on data science use cases, reports, tutorials, competition entries, etc. Your task is to write a short essay on a post of your choice. The essay should cover a topic you want to learn more about. You must pick an article from one of these outlets:
1. https://medium.com/kaggle-blog
2. https://colah.github.com
3. https://www.analyticsvidhya.com/blog/
Submission Requirements
You will have to submit 2 files:
Your essay in PDF format
Your essay in Word docx format.
The content of these must be identical.
This is an individual assignment. You are not allowed to share your work with others. Your submission will be checked using Turnitin and suspected cases of collusion and/or plagiarism will be reported.
The usual penalties for late submissions apply and will be applied automatically. The automatic lateness penalty takes effect at the designated submission deadline, with no exceptions. Please make sure you submit well ahead of the deadline, since even one second over the specified deadline will incur the full lateness penalty of 10% per day. See the module handbook for details.
Required content and marking scheme
This assignment is worth 30 % of the overall assessment for this module.
This assignment has a maximum score of 20 marks.
14 marks will be awarded for content, coherence and clarity.
Required content of the essay:
State the purpose of the blog post/report of your choice: Report, educational, competition entry etc. (2 marks).
State your personal motivation why you have chosen this post (2 marks).
Summarize the data science problem that is being addressed, or, in case of educational posts, the data science technique that is explained (2 marks).
List the techniques and tools that have been used (4 marks).
Techniques are the actual methods, e.g. linear regression, deep learning, Principal component analysis etc.
Tools are the software packages used, e.g. numpy, scipy, scikit-learn, or Java/Hadoop
Summarise the outcome of the data science campaign (if report or competition entry), or the gained knowledge (if educational blog post) (2 marks).
Critically discuss whether you think the chosen techniques and tools are appropriate (2 marks).
Remarks
You must use a blog post from the provided sources and specify which source you used. Using a different source will lead to downmarking by up to -7 marks.
Add a headline and your name on the top. Add a reference section to the end. The reference section must contain the full link to the blog post. If the link to the blog post is missing or wrong, your essay will be downmarked by up to 7 marks.
Write in a clear and coherent style. An essay lacking coherence and/or clarity will be downmarked by up to -7 marks.
Do not plagiarise the blog post in your essay. Any instance of plagiarism, such as copied sentences, or parts of sentences, even with nouns replaced with synonyms, will incur downmarking by up to -14 marks.
The total penalty for the content, coherence and clarity is -14 marks.
6 marks will be awarded for technical aspects.
You must add a word count to the end of your essay. A missing word count will incur -3 marks. Headline, your name, the reference section and the word count itself are excluded from the word count.
Adhere to the word count of 450 words +/- 10 %. No penalty will be incurred if the word count is between 405 and 495. Word counts beyond these limits will incur a penalty of -6 marks.
Use a spell and grammar checker. Reductions for grammar and spelling errors: -1 mark per error, max -6 marks.
The total penalty for technical aspects is -6 marks.
Rubric
Data Science Essay
Criteria, Ratings, Pts
Each criterion is linked to a learning outcome.

Criterion: State the purpose of the blog post/report (2 pts)
State the purpose of the blog post/report of your choice: Report, educational, competition entry etc.
Ratings: 2 Pts, achieved to full extent; 1 Pts, achieved to some extent; 0 Pts, not achieved.

Criterion: Why have you chosen this post (2 pts)
Your personal reasoning behind choosing this post.
Ratings: 2 Pts, achieved to full extent; 1 Pts, achieved to some extent; 0 Pts, not achieved.

Criterion: Data science problem or topic covered (2 pts)
Summarize the data science problem that is being addressed, or if educational the topic that is covered.
Ratings: 2 Pts, achieved to full extent; 1 Pts, achieved to some extent; 0 Pts, not achieved.

Criterion: List the techniques and tools that have been used (4 pts)
Techniques are the actual methods, e.g. linear regression, deep learning, Principal component analysis etc.
Tools are the software packages used, e.g. numpy, scipy, scikit-learn, or Java/Hadoop.
Ratings: 4 Pts, achieved to full extent; 4 to >0.0 Pts, achieved to some extent; 0 Pts, not achieved.

Criterion: Outcome of the data science campaign or gained knowledge (2 pts)
Summarise the outcome of the data science campaign (if report or competition entry), or the gained knowledge (if educational blog post).
Ratings: 2 Pts, achieved to full extent; 0 Pts, not achieved.

Criterion: Critical discussion (2 pts)
Critically discuss whether you think the chosen techniques and tools are appropriate.
Ratings: 2 Pts, achieved to full extent; 0 Pts, not achieved.

Criterion: Technical aspects (6 pts)
You must add a word count to the end of your essay. A missing word count will incur -3 marks.
Adhere to the word count of 450 words +/- 10 %. No penalty will be incurred if the word count is between 405 and 495. Word counts beyond +/-10% will incur a penalty of -6 marks.
Use a spell and grammar checker. Reductions for grammar and spelling errors: -1 mark per error, max -6 marks.
The total penalty for technical aspects is -6 marks.
Ratings: 6 Pts, technically sound; 6 to >0.0 Pts, some penalty for technical shortcomings; 0 Pts, max penalty for technical shortcomings.
Total points: 20
7 Popular Feature Selection Routines in Machine Learning
Harini Subhasri Iragavarapu
Introduction and Personal motivation:
This essay discusses a blog post that explores seven popular ways of selecting features. Practical datasets usually contain many unnecessary features, which degrade the performance of the model. The post's discussion of how to choose the best features for training a robust ML model motivated me to choose it.
The domain knowledge of a data scientist or machine learning engineer helps in choosing the best features and sets of variables. Datasets often contain missing values, which may arise from a failure to record data or from data corruption. Various techniques can be used to impute missing values, but the imputed values do not necessarily match the real data. A model trained on features with many missing values may therefore not perform well.
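As a minimal sketch of the missing-value idea, the snippet below measures the share of missing entries per feature and drops features above a cut-off before imputing the rest. The toy data, the column names and the 50% threshold are assumptions made for illustration and do not come from the blog post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 38],
    "income": [np.nan, np.nan, np.nan, 52000, np.nan],
    "score": [0.7, 0.4, 0.9, 0.5, 0.6],
})

# Fraction of missing entries in each column.
missing_ratio = df.isna().mean()

# Assumed cut-off: drop features with more than 50% missing values.
threshold = 0.5
kept_columns = missing_ratio[missing_ratio <= threshold].index.tolist()
reduced = df[kept_columns]

# Mean imputation for the remaining gaps (one simple option among many).
imputed = reduced.fillna(reduced.mean())

print(missing_ratio)
print(kept_columns)
```

Here "income" is dropped because most of its values are missing, while the single gap in "age" is filled with the column mean.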
Correlation with the target label and correlation between features can be measured with coefficients such as Pearson, Spearman and Kendall. In pandas, df.corr() returns the correlation coefficients between the features. Features that are highly correlated with the target are regarded as key features, whereas features that are uncorrelated with the target are unlikely to contribute much to model performance. Features that are highly correlated with each other carry largely redundant information.
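The two correlation checks can be sketched with pandas as follows. The data frame, the column names and the 0.95 redundancy cut-off are hypothetical choices for illustration; only df.corr() itself is taken from the essay.

```python
import pandas as pd

# Hypothetical toy data; "x2" is an exact multiple of "x1".
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 12],
    "x3": [5, 3, 6, 2, 7, 1],
    "target": [1.1, 2.0, 2.9, 4.2, 5.1, 5.8],
})

# Pearson is the default; "spearman" and "kendall" are also accepted.
corr = df.corr(method="pearson")

# Rank features by absolute correlation with the target.
target_corr = corr["target"].drop("target").abs().sort_values(ascending=False)

# Flag feature pairs that are almost perfectly correlated with each other.
features = [c for c in df.columns if c != "target"]
redundant_pairs = [
    (a, b)
    for i, a in enumerate(features)
    for b in features[i + 1:]
    if abs(corr.loc[a, b]) > 0.95  # assumed cut-off
]

print(target_corr)
print(redundant_pairs)
```

In this toy example one of "x1" and "x2" could be dropped, since they carry the same information, and "x3" is only weakly related to the target.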
Principal Component Analysis (PCA) is a dimensionality reduction method that extracts features from the dataset. It uses matrix factorization to project the data into a lower dimension, and it is used when the dimensionality of the data is high.
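One way to illustrate the matrix factorization behind PCA is a singular value decomposition of the centred data, sketched below with NumPy. The synthetic data and the choice of two components are assumptions for illustration; in practice a library implementation would normally be used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 samples, 5 features, of which the last three
# are combinations of the first two or low-variance noise.
base = rng.normal(size=(100, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 1],
    base[:, 0] + base[:, 1],
    2.0 * base[:, 0] + 0.01 * rng.normal(size=100),
    0.05 * rng.normal(size=100),
])

# Centre the data, then factorize it with SVD.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance captured by each principal component.
explained_variance_ratio = S**2 / np.sum(S**2)

# Project onto the first k components (k = 2 is an assumed choice).
k = 2
X_reduced = X_centered @ Vt[:k].T

print(explained_variance_ratio.round(4))
print(X_reduced.shape)
```

Note that the resulting components are new combinations of the original variables rather than a subset of them, so PCA is feature extraction rather than selection in the strict sense.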
Forward or backward feature selection helps to find the best-performing subset of features for the ML model. When there are n features, variables are selected based on the inference from previous results. Forward feature selection proceeds as follows: train the model and evaluate its performance, keep the variable or set of features that gives the best result, and repeat until the desired number of features is obtained.
Figure: Forward feature selection
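The forward selection loop can be sketched as a greedy search. The version below is an illustrative assumption rather than the blog post's code: it uses an ordinary least-squares model, a held-out validation split, and the hypothetical helper names validation_error and forward_selection.

```python
import numpy as np

def validation_error(X_train, y_train, X_val, y_val, columns):
    """Fit least squares on the chosen columns and return validation MSE."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, columns]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, columns]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residual = y_val - A_val @ coef
    return float(np.mean(residual ** 2))

def forward_selection(X_train, y_train, X_val, y_val, n_features):
    """Greedily add the feature that lowers the validation error the most."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    while len(selected) < n_features and remaining:
        scores = {
            j: validation_error(X_train, y_train, X_val, y_val, selected + [j])
            for j in remaining
        }
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: only features 1 and 4 influence the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)
X_train, X_val = X[:150], X[150:]
y_train, y_val = y[:150], y[150:]

chosen = forward_selection(X_train, y_train, X_val, y_val, n_features=2)
print(chosen)
```

Backward selection would run the same loop in reverse, starting from all features and removing the least useful one at each step.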
Feature importance assigns a score to each variable. The result is a ranked list of features, and it is built into scikit-learn models such as tree-based estimators.
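As a brief sketch of feature importance in scikit-learn, the example below fits a random forest and reads its feature_importances_ attribute. The choice of a random forest and the synthetic data are assumptions for illustration; the essay does not say which model the blog post used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: the label depends only on feature 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 2] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# One non-negative score per feature; the scores sum to 1.
importances = model.feature_importances_

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]

print(importances.round(3))
print(ranking)
```

Features at the end of the ranking are candidates for removal, although impurity-based scores of this kind can be misleading for correlated features.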
Conclusion:
People with data or domain knowledge can help to select the best features. As for missing values, a model trained on such features may not perform well even after imputation. When features are correlated with each other, a change in one variable is accompanied by a change in the other. PCA reduces the dataset from many variables to the desired number of components, but removing the redundant variables themselves remains a tough task. In backward feature selection, all the variables are chosen first and the most redundant feature is removed at each step, whereas forward selection starts with no features and adds the best one at each step. Feature importance scores identify the best subset of features. By comparing these seven techniques, one can develop a data science model with good performance.
Reference:
Word Count: 493