Friday, 29 March 2013

Assignment #10: 26th March, 2013


Problem 1:-

Ques-1 Create 3 vectors x, y and z with random values, ensuring they are of equal length, and bind them column-wise:
T<- cbind(x,y,z)
Create a 3-dimensional plot of the same (all 3 types).

Solution:-

Commands:-

> sample<-rnorm(50,25,6)

> x<-sample(sample,10)

> y<-sample(sample,10)

> z<-sample(sample,10)

> T<-cbind(x,y,z)

Screenshots:-

> library(rgl)

> plot3d(T)

> plot3d(T,col=rainbow(1000))

> plot3d(T,col=rainbow(1000),type="s")
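The steps of Problem 1 can be sketched as one self-contained script. The seed, the names population and pts, and the dimension check are additions that are not in the original commands. The plotting part is guarded so the script still runs where rgl is not installed or no display is available. plot3d's type argument accepts, among others, "p" (points), "s" (spheres) and "l" (lines).

```r
# Minimal sketch: three equal-length random vectors bound into a matrix.
set.seed(42)                                 # assumption: seed added for reproducibility

population <- rnorm(50, mean = 25, sd = 6)   # values to draw from
x <- sample(population, 10)
y <- sample(population, 10)
z <- sample(population, 10)

pts <- cbind(x, y, z)                        # 10 x 3 matrix, one column per vector
stopifnot(nrow(pts) == 10, ncol(pts) == 3)   # equal lengths give a rectangular matrix

# Draw one 3D plot per type, only in an interactive session with rgl available.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  for (type in c("p", "s", "l")) {
    rgl::open3d()
    rgl::plot3d(pts, col = rainbow(nrow(pts)), type = type)
  }
}
```

Using rainbow(nrow(pts)) gives exactly one colour per point, whereas rainbow(1000) only uses the first 10 of its 1000 colours here.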

Problem 2:-

Read the documentation of rnorm and pnorm, then:

Create 2 random variables
Create the following plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y) Hint: ?factor
3. Color code and draw the graph
4. Smooth and best fit line for the curve

Solution:-

Commands:-

> x<-rnorm(200,mean=5,sd=1)
> y<-rnorm(200,mean=3,sd=1)
> z1<-sample(letters,5)
> z2<-sample(z1,200,replace=TRUE)
> z<-as.factor(z2)
> t<-cbind(x,y,z)
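One thing worth checking here is what cbind does to a factor. The sketch below uses made-up category labels and a seed that are not in the original. It shows that cbind gives a numeric matrix in which z is stored as integer codes, while a data frame keeps z as a factor with its labels.

```r
# Minimal sketch: a 5-category factor combined with two numeric vectors.
set.seed(1)                                   # assumption: seed added for reproducibility
x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)

cats <- c("a", "b", "c", "d", "e")            # hypothetical category labels
z <- factor(sample(cats, 200, replace = TRUE), levels = cats)

m   <- cbind(x, y, z)        # numeric matrix: z becomes integer codes 1..5
dat <- data.frame(x, y, z)   # data frame: z stays a factor

is.numeric(m)                # TRUE, the labels are gone from the matrix
levels(dat$z)                # the five labels are still available here
table(dat$z)                 # number of observations per category
```

For plotting with colour = z the factor itself is what matters, so the matrix from cbind is not needed for the qplot calls.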

Screenshots:-

> library(ggplot2)

> qplot(x,y)

> qplot(x,z,alpha=I(2/10))

> qplot(x,z)

> qplot(x,y,geom=c("point","smooth"))

> qplot(x,y,colour=z)

> qplot(log(x),log(y),colour=z)
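For item 4, the smooth and the best fit line can also be computed directly, which makes it easier to see what is being drawn. This is a base R sketch under my own assumptions: lm gives the least-squares line, and lowess is used as a stand-in smoother, which is not necessarily the method the "smooth" geom picks.

```r
# Minimal sketch: best-fit line and a smoother for an X-Y scatter.
set.seed(2)                          # assumption: seed added for reproducibility
x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)

fit      <- lm(y ~ x)                # least-squares best-fit line
smoothed <- lowess(x, y)             # locally weighted smoother (stand-in)

coef(fit)                            # intercept and slope of the line

# Draw the scatter, the line and the smoother in an interactive session.
if (interactive()) {
  plot(x, y)
  abline(fit, col = "red")
  lines(smoothed, col = "blue")
}
```

Since x and y are generated independently, the slope is expected to be close to zero and both curves should be nearly flat.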

Saturday, 23 March 2013

Assignment no.-9, March 19th, 2013

Assignment: To analyse a data visualization tool and comment on its usage.

TABLEAU PUBLIC

The data visualization tool that I have analysed is called Tableau Public. It is a premier tool used for business intelligence (BI). It can take data from various sources such as MS Excel, MS Access, SQL Server databases, Oracle databases and free databases such as MySQL.

Data In. Brilliance Out.

Tableau Public is a free data storytelling application. One can create and share interactive charts and graphs, stunning maps, live dashboards and fun applications in minutes, then publish anywhere on the web. Anyone can do it, it’s that easy—and it’s free.

Scope of this tool : This tool can turn data into any number of visualizations, from simple to complex. You can drag and drop fields onto the work area and ask the software to suggest a visualization type, then customize everything from labels and tool tips to size, interactive filters and legend display.

Uniqueness: Tableau Public offers a variety of ways to display interactive data. You can combine multiple connected visualizations onto a single dashboard, where one search filter can act on numerous charts, graphs and maps; underlying data tables can also be joined. And once you get the hang of how the software works, its drag-and-drop interface is considerably quicker than manually coding in JavaScript or R for most users, making it more likely that you'll try additional scenarios with your data set. In addition, you can easily perform calculations on data within the software.

Drawbacks: In the free version of Tableau's business intelligence software, your visualization and data must reside on Tableau's site. Whenever you save your work, it gets sent up to the public website -- which means you can't save work in progress without running the risk that it will be seen before it's ready (while Tableau's site won't deliberately expose your work, it relies on security by obscurity -- so someone could see your work if they guess your URL). And once it's saved, viewers are invited to download your entire workbook with data. Upgrading to a single-user desktop edition costs $999.

Not surprisingly, all that functionality comes at a cost: Tableau's learning curve is fairly steep compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it'll take more than an hour or two to learn how to use the software's true capabilities, although you can get up and running doing simple charts and maps before too long.

Skill level: Advanced beginner to intermediate.

Runs on: Windows 7, Vista, XP, Server 2008, Server 2003.

Tableau Desktop Public Edition is Windows software only.

System Requirements:

Microsoft® Windows® 8, Windows 7, Windows Vista, Windows XP, Server 2012, Server 2008, Server 2003
250 megabytes minimum free disk space
32-bit or 64-bit versions of Windows
32-bit color depth recommended

Note: Internet Explorer 6 is not supported.

Friday, 15 March 2013

Assignment - 8th March, 2013

Assignment no:-1

Perform Panel Data Analysis of “Produc” data

Solution:

There are three types of models:

Pooled effect model
Fixed effect model
Random effect model

We will determine which model is best using the following functions:
1) pFtest: for choosing between fixed and pooled
2) plmtest: for choosing between pooled and random
3) phtest: for choosing between random and fixed
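The way the three tests listed above combine into one decision can be sketched as a small function. choose_panel_model is a hypothetical helper written for illustration, it is not part of plm, and the 0.05 cut-off is an assumption. It only takes the three p-values as input.

```r
# Hypothetical helper (not part of plm): pick a model from three p-values.
choose_panel_model <- function(p_pFtest, p_plmtest, p_phtest, alpha = 0.05) {
  fixed_beats_pooled  <- p_pFtest  < alpha   # pFtest rejects pooled in favour of fixed
  random_beats_pooled <- p_plmtest < alpha   # plmtest rejects pooled in favour of random

  if (!fixed_beats_pooled && !random_beats_pooled) return("pooling")
  if (fixed_beats_pooled && !random_beats_pooled)  return("within")
  if (!fixed_beats_pooled && random_beats_pooled)  return("random")

  # Both beat the pooled model, so the Hausman test (phtest) decides.
  if (p_phtest < alpha) "within" else "random"
}

# Usage with made-up p-values:
choose_panel_model(0.40, 0.60, 0.90)      # no test rejects the pooled model
choose_panel_model(0.001, 0.002, 0.30)    # Hausman does not reject random
choose_panel_model(0.001, 0.002, 0.004)   # Hausman rejects random
```

The returned strings match the values that plm takes in its model argument ("pooling", "within", "random").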

The data can be loaded using the following commands:-
library(plm)
data(Produc, package="plm")
head(Produc)

Screenshot:-

Pooled Effect Model

pool <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="pooling",index =c("state","year"))
summary(pool)

Screenshot:-

Fixed Effect Model:

fixed<-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="within",index =c("state","year"))

summary(fixed)
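To see what the "within" (fixed effect) estimator does, here is a sketch on a small synthetic panel that does not use plm or the Produc data. All names and numbers in it are made up. It subtracts each unit's mean from x and y and regresses the demeaned values, then checks that the slope equals the one from a regression with one dummy per unit.

```r
# Minimal sketch: within (demeaning) estimator vs. dummy-variable regression.
set.seed(3)                                    # assumption: seed added for reproducibility
n_id <- 5                                      # hypothetical number of units
n_t  <- 10                                     # hypothetical number of periods

id      <- factor(rep(seq_len(n_id), each = n_t))
alpha_i <- rep(rnorm(n_id, sd = 3), each = n_t)   # unit-specific intercepts
x       <- rnorm(n_id * n_t) + alpha_i            # regressor correlated with the intercepts
y       <- alpha_i + 2 * x + rnorm(n_id * n_t, sd = 0.5)

# Within transformation: remove each unit's own mean.
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
b_within <- coef(lm(y_w ~ x_w - 1))[["x_w"]]

# Same slope from a regression with one dummy per unit.
b_dummy <- coef(lm(y ~ x + id))[["x"]]

all.equal(b_within, b_dummy)
```

Because the unit means are removed, only the variation over time inside each unit is used to estimate the slope.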

Screenshot:-

Random Effect Model:

random <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model="random",index =c("state","year"))

summary(random)

Screenshot:-

Testing of Model

This can be done through Hypothesis testing between the models as follows:

H0: Null Hypothesis: the individual index and time based parameters are all zero

H1: Alternate Hypothesis: at least one of the index and time based parameters is non-zero

Pooled vs Fixed

Null Hypothesis: Pooled Effect Model

Alternate Hypothesis: Fixed Effect Model

Command:

> pFtest(fixed,pool)

Result:

data: log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects

Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Fixed Effect Model is preferred over the Pooled Effect Model.
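The degrees of freedom and the p-value in this result can be cross-checked by hand. The sketch below assumes the Produc panel has 48 states observed over 17 years (1970 to 1986), which is how the dataset is documented in plm, and uses the 7 regressors of the model above.

```r
# Minimal sketch: reproduce df1, df2 and the p-value of the reported F test.
n_states <- 48    # assumption: number of states in Produc
n_years  <- 17    # assumption: years 1970-1986
k        <- 7     # regressors in the model formula

df1 <- n_states - 1                          # state effects tested against a common intercept
df2 <- n_states * n_years - n_states - k     # residual df of the fixed effect model

f_stat  <- 56.6361                           # F statistic reported above
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

c(df1 = df1, df2 = df2)
p_value < 2.2e-16
```

The two degrees of freedom agree with the df1 = 47 and df2 = 761 shown in the output.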

Pooled vs Random

Null Hypothesis: Pooled Effect Model

Alternate Hypothesis: Random Effect Model

Command :

> plmtest(pool)

Result:

Lagrange Multiplier Test - (Honda)

data: log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)

normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects

Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Random Effect Model is preferred over the Pooled Effect Model.

Random vs Fixed

Null Hypothesis: No correlation between the individual effects and the regressors (Random Effect Model)

Alternate Hypothesis: Fixed Effect Model

Command:

> phtest(fixed,random)

Result:

Hausman Test

data: log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)

chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent

Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Fixed Effect Model is preferred over the Random Effect Model.
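The Hausman result can be checked the same way from the reported numbers alone. The statistic is compared against a chi-square distribution with 7 degrees of freedom, one per regressor. The 5% significance level used for the critical value is my assumption.

```r
# Minimal sketch: p-value and critical value for the reported Hausman statistic.
chisq_stat <- 93.546      # statistic reported above
df         <- 7           # one degree of freedom per regressor

p_value  <- pchisq(chisq_stat, df = df, lower.tail = FALSE)
critical <- qchisq(0.95, df = df)    # assumption: 5% significance level

chisq_stat > critical     # the statistic lies far in the rejection region
p_value < 2.2e-16
```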

Conclusion:

So after running all the tests we conclude that the Fixed Effect Model is best suited for the panel data analysis of the “Produc” data set.

Hence, we conclude that each id, i.e. each “state”, has its own time-invariant effect, and the coefficients are estimated from the variation over time within each state.

BIS- LABS
