In Julia, consider an array "a" having 4 values:
$ julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.0 (2017-06-19 13:05 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-pc-linux-gnu
julia> a = [1,2,3,4]
4-element Array{Int64,1}:
1
2
3
4
Now consider the case of missing values. Suppose we want to set element 1 of array a to "no value". How would we do that? Let us try the following:
julia> a[1] = ""
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64
This may have arisen from a call to the constructor Int64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] setindex!(::Array{Int64,1}, ::String, ::Int64) at ./array.jl:549
julia> a[1] =
;
ERROR: syntax: unexpected ;
We find that it is not possible to assign "no value" to an element of a regular Julia array, either as an empty string or by leaving the right-hand side empty. To represent this concept of "no value", or a missing value, the Julia ecosystem offers a singleton object, NA. Now, let us try to use NA on a regular Julia array:
julia> a[1] = NA
ERROR: UndefVarError: NA not defined
We find that NA is not available to use directly. To be able to use NA, we load the DataArrays package:
julia> using DataArrays
julia> a[1] = NA
ERROR: MethodError: Cannot `convert` an object of type DataArrays.NAtype to an object of type Int64
This may have arisen from a call to the constructor Int64(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] setindex!(::Array{Int64,1}, ::DataArrays.NAtype, ::Int64) at ./array.jl:549
Even with the package loaded, a regular Julia array does not accept NA ("no value / not available / null / missing value") as an element, because NA cannot be converted to Int64. Hence, we need to use a DataArray instead of a regular Julia array:
julia> b = DataArray([1,2,3,4])
4-element DataArrays.DataArray{Int64,1}:
1
2
3
4
Now, let us assign the value NA.
julia> b[1] = NA
NA
julia> b
4-element DataArrays.DataArray{Int64,1}:
NA
2
3
4
julia>
Success!
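To confirm where the NA values sit without printing the whole array, the package's isna function can be used. The following is a minimal sketch, assuming Julia 0.6 and a DataArrays release that still exports NA, isna and dropna (as the session above does); the variable names mask and na_positions are only illustrative.

```julia
# Assumes Julia 0.6 with a DataArrays release that exports NA, isna and dropna.
using DataArrays

b = DataArray([1, 2, 3, 4])
b[1] = NA

mask = isna(b)             # Boolean mask, true where the element is NA
na_positions = find(mask)  # indices of the NA elements (find is the Julia 0.6 name)
values_present = dropna(b) # plain Vector holding only the non-NA elements
```

The mask gives the positions of the missing elements, so they do not have to be known in advance.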
Now, let us try to calculate the mean over the elements of vector "b":
julia> mean(b)
NA
We see that mean returns NA when the vector contains an NA element: the missing value propagates through the computation. One workaround is to compute the mean over a slice of "b" that skips the NA element:
julia> mean(b[2:end])
3.0
But this technique is inconvenient: to slice the vector, we need to know in advance which elements are NA. It is better to use the dropna function:
julia> mean(dropna(b))
3.0
We can see that the dropna function overcomes this drawback of slicing: the mean is calculated over the remaining 3 elements.
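For readers on a newer Julia, here is a hedged sketch of the same steps without DataArrays. It assumes Julia 1.0 or later, where the built-in missing value and the skipmissing function play the roles of NA and dropna, and mean lives in the Statistics standard library. The names m_all and m_kept are only illustrative.

```julia
# Assumes Julia 1.0 or later; no external packages are needed.
using Statistics  # provides mean

# The element type must explicitly allow missing, otherwise the assignment fails
# with a conversion error, much like the regular-array case above.
b = Union{Missing, Int}[1, 2, 3, 4]
b[1] = missing

m_all = mean(b)                # the missing value propagates into the result
m_kept = mean(skipmissing(b))  # mean over the non-missing elements 2, 3 and 4
```

As with dropna, skipmissing does not require knowing in advance which positions hold missing values.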