Friday, 29 March 2013

Assignment #10: 26th March, 2013


Problem 1:-
Create 3 vectors x, y and z of equal length, filled with random values of your choice, and bind them column-wise:
T<-cbind(x,y,z)
Create a 3-dimensional plot of the same (all 3 types).
Solution:-
Commands:-
> library(rgl)
> values<-rnorm(50,25,6)
> x<-sample(values,10)
> y<-sample(values,10)
> z<-sample(values,10)
> T<-cbind(x,y,z)
Screenshots:-
Image
> plot3d(T)
Image
> plot3d(T,col=rainbow(1000))
Image
> plot3d(T,col=rainbow(1000),type='s')
Image
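
The steps of Problem 1 can also be written as a reproducible script. This is only a sketch: the seed, the names `values` and `pts`, and the reading of "all 3 types" as the rgl types "p" (points), "s" (spheres) and "l" (lines) are assumptions, not part of the assignment. The plotting part runs only in an interactive session where the rgl package is installed.

```r
set.seed(1)                              # assumed seed, only for reproducibility

# A pool of 50 normal values (mean 25, sd 6) from which the vectors are drawn
values <- rnorm(50, mean = 25, sd = 6)
x <- sample(values, 10)
y <- sample(values, 10)
z <- sample(values, 10)

# Named pts instead of T, because T is also R's shorthand for TRUE
pts <- cbind(x, y, z)

# Assumed interpretation of "all 3 types": points, spheres, lines
types <- c("p", "s", "l")

if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  for (ty in types) {
    rgl::open3d()                        # one window per plot type
    rgl::plot3d(pts, col = rainbow(nrow(pts)), type = ty)
  }
}
```

Using `rainbow(nrow(pts))` gives exactly one colour per point, whereas `rainbow(1000)` generates far more colours than the 10 points can use.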


Problem 2:-
Read the documentation of rnorm and pnorm, then create 2 random variables.
Create the following plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y) Hint: ?factor
3. Color code and draw the graph
4. Smooth and best fit line for the curve
Solution:-
Commands:-
> library(ggplot2)
> x<-rnorm(200,mean=5,sd=1)
> y<-rnorm(200,mean=3,sd=1)
> z1<-sample(letters,5)
> z2<-sample(z1,200,replace=TRUE)
> z<-as.factor(z2)
> t<-cbind(x,y,z)
Screenshots:-
Image
> qplot(x,y)
Image
> qplot(x,z,alpha=I(2/10))
Image
> qplot(x,z)

Image
> qplot(x,y,geom=c("point","smooth"))
Image
> qplot(x,y,colour=z)
Image
> qplot(log(x),log(y),colour=z)
Image
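
One detail of the commands above is worth checking: cbind() drops the factor class and stores z as its integer codes, while a data frame keeps z as a factor. The sketch below uses base R only. The seed and the names `m`, `d`, `fit` and `smooth_xy` are assumptions introduced for illustration. It also computes a least-squares line and a lowess curve as one possible base-R counterpart of the "smooth and best fit line" item.

```r
set.seed(42)                                  # assumed seed, only for reproducibility

x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)
z <- factor(sample(sample(letters, 5), 200, replace = TRUE))

m <- cbind(x, y, z)        # numeric matrix: z is replaced by its integer codes
d <- data.frame(x, y, z)   # data frame: z stays a factor with 5 levels

str(m)
str(d)

# Best fit line (ordinary least squares) and a smooth curve (lowess)
fit <- lm(y ~ x, data = d)
smooth_xy <- lowess(d$x, d$y)

# Mean of y within each of the 5 categories of z
tapply(d$y, d$z, mean)
```

Because x and y are drawn independently here, the fitted slope should be close to zero and the coloured groups should overlap heavily in the plots.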

Saturday, 23 March 2013

Assignment no. 9, March 19th, 2013

Assignment: To analyse a data visualization tool and comment on its usage.

TABLEAU PUBLIC

The data visualization tool that I have analysed is called Tableau Public. It is a premier tool used for business intelligence (BI). It can take data from various sources such as MS Excel, MS Access, SQL Server databases, Oracle databases and free databases such as MySQL.


Data In. Brilliance Out.

Tableau Public is a free data storytelling application. One can create and share interactive charts and graphs, stunning maps, live dashboards and fun applications in minutes, then publish anywhere on the web. Anyone can do it, it’s that easy—and it’s free.


Scope of this tool : This tool can turn data into any number of visualizations, from simple to complex. You can drag and drop fields onto the work area and ask the software to suggest a visualization type, then customize everything from labels and tool tips to size, interactive filters and legend display.



Uniqueness: Tableau Public offers a variety of ways to display interactive data. You can combine multiple connected visualizations onto a single dashboard, where one search filter can act on numerous charts, graphs and maps; underlying data tables can also be joined. And once you get the hang of how the software works, its drag-and-drop interface is considerably quicker than manually coding in JavaScript or R for most users, making it more likely that you'll try additional scenarios with your data set. In addition, you can easily perform calculations on data within the software.
Drawbacks: In the free version of Tableau's business intelligence software, your visualization and data must reside on Tableau's site. Whenever you save your work, it gets sent up to the public website -- which means you can't save work in progress without running the risk that it will be seen before it's ready (while Tableau's site won't deliberately expose your work, it relies on security by obscurity -- so someone could see your work if they guess your URL). And once it's saved, viewers are invited to download your entire workbook with data. Upgrading to a single-user desktop edition costs $999.
Not surprisingly, all that functionality comes at a cost: Tableau's learning curve is fairly steep compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it'll take more than an hour or two to learn how to use the software's true capabilities, although you can get up and running doing simple charts and maps before too long.
Skill level: Advanced beginner to intermediate.
Runs on: Windows 7, Vista, XP, 2003, Server 2008, 2003.




     
Tableau Desktop Public Edition is Windows software only.

System Requirements:

  • Microsoft® Windows® 8, Windows 7, Windows Vista, Windows XP, Server 2012, Server 2008, Server 2003
  • 250 megabytes minimum free disk space
  • 32-bit or 64-bit versions of Windows
  • 32-bit color depth recommended
Note: Internet Explorer 6 is not supported.

Friday, 15 March 2013

Assignment: 8th March, 2013


Assignment no. 1
Perform Panel Data Analysis of the “Produc” data
Solution:
There are three types of models:
  •       Pooled effects model
  •       Fixed effects model
  •       Random effects model
We will determine which model is best by using the following functions:
      1) pFtest: for choosing between fixed and pooled
      2) plmtest: for choosing between pooled and random
      3) phtest: for choosing between random and fixed
The data can be loaded using the following commands:-
library(plm)
data(Produc, package="plm")
head(Produc)
Screenshot:-
Image

Pooled Effects Model:
pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="pooling", index=c("state","year"))
summary(pool)
Screenshot:-
Image

Fixed Effects Model:
 
fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="within", index=c("state","year"))
summary(fixed)
 
Screenshot:-
 
Image
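
To illustrate what model="within" estimates, here is a minimal base-R sketch on simulated data (not the Produc data). All names, the seed and the true slope of 1.5 are assumptions made for this illustration. The within estimator subtracts each individual's own mean from every variable, which gives the same slope as a regression with one dummy variable per individual.

```r
set.seed(7)                                   # assumed seed, only for reproducibility

n_id <- 6                                     # hypothetical number of individuals
n_t  <- 8                                     # hypothetical number of time periods
id   <- factor(rep(seq_len(n_id), each = n_t))

# Individual effect: constant over time within an id, different across ids
alpha <- rnorm(n_id, sd = 2)[as.integer(id)]

# Regressor correlated with the individual effect; assumed true slope is 1.5
x <- rnorm(n_id * n_t) + alpha
y <- 1.5 * x + alpha + rnorm(n_id * n_t, sd = 0.1)

# Within transformation: subtract each individual's own mean
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)

b_within <- coef(lm(y_w ~ x_w - 1))[["x_w"]]  # within (fixed effects) slope
b_lsdv   <- coef(lm(y ~ x + id))[["x"]]       # same slope via id dummies
b_pooled <- coef(lm(y ~ x))[["x"]]            # pooled slope, ignores the id effect

c(within = b_within, lsdv = b_lsdv, pooled = b_pooled)
```

In this simulation the pooled slope is pulled away from 1.5 because x is correlated with the ignored individual effect, while the within slope stays close to it.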
 
 
Random Effects Model:
 
random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="random", index=c("state","year"))
summary(random)
 
Screenshot:-
 
Image
 
 
Testing of Model
 
This can be done through Hypothesis testing between the models as follows:
 
H0: Null Hypothesis: the individual (index) and time-based parameters are all zero
H1: Alternate Hypothesis: at least one of the individual or time-based parameters is non-zero
 
Pooled vs Fixed
 
Null Hypothesis: Pooled Effects Model
Alternate Hypothesis: Fixed Effects Model
 
Command:
 > pFtest(fixed,pool)
 
Result:
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects 
Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Fixed Effects Model.
 
Pooled vs Random
 
Null Hypothesis: Pooled Effects Model
Alternate Hypothesis: Random Effects Model
 
Command :
> plmtest(pool)
 
Result:
 
 Lagrange Multiplier Test – (Honda)
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects 
Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Random Effects Model.
 
 
Random vs Fixed
 
Null Hypothesis: No correlation between the individual effects and the regressors (Random Effects Model)
Alternate Hypothesis: Fixed Effects Model
 
Command:
 > phtest(fixed,random)
 
Result: 
 
 Hausman Test
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
Since the p-value is negligible, we reject the Null Hypothesis and accept the Alternate Hypothesis, i.e. the Fixed Effects Model.
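
The p-values reported by the three tests can be cross-checked in base R from the statistics printed above. This is a sketch under assumptions: it uses the upper tail of the F, standard normal and chi-squared distributions, and the variable names are introduced here for illustration. R prints "< 2.2e-16" when a p-value is smaller than .Machine$double.eps.

```r
# Statistics copied from the test outputs above
p_f       <- pf(56.6361, df1 = 47, df2 = 761, lower.tail = FALSE)  # pFtest
p_honda   <- pnorm(57.1686, lower.tail = FALSE)                    # plmtest (Honda)
p_hausman <- pchisq(93.546, df = 7, lower.tail = FALSE)            # phtest

p_values <- c(pFtest = p_f, plmtest = p_honda, phtest = p_hausman)
print(p_values)

# TRUE for every test whose p-value would be printed as "< 2.2e-16"
p_values < .Machine$double.eps
```

All three values fall below that threshold, which agrees with the "p-value < 2.2e-16" lines in the outputs.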
 
Conclusion: 
 
So after running all the tests we come to the conclusion that the Fixed Effects Model is best suited for the panel data analysis of the “Produc” data set.
 
Hence, we conclude that the individual effect is specific to each id: it differs between states but does not vary over time within the same “state”.