A visualization constructed using the vega-lite-api.
In the Boston Housing dataset, the target variable is medv, the median value of owner-occupied homes. One field, rad, is an index of accessibility to radial highways, and its distribution is interesting: for one group of houses the value is always 24 (which probably means the highways are hard to reach), while for the remaining houses it ranges from 1 to 8. It is therefore natural to split the dataset in two by this variable, and I wonder whether the two sub-datasets differ in any way. The four related plots are explained below:
See the data on Gist: Housing Values at Boston Suburbs
The original data comes from GitHub: Boston Housing Data
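
The split on rad described above can be sketched in plain JavaScript before any plotting is done. This is only a minimal illustration, not the notebook's actual code: the helper name `splitByRad` and the sample rows are hypothetical, and only the field names `rad` and `medv` come from the dataset.

```javascript
// Hypothetical sample rows for illustration only; the real data has many more
// rows and columns. Only the field names `rad` and `medv` come from the dataset.
const sampleRows = [
  { rad: 1, medv: 24.0 },
  { rad: 4, medv: 21.6 },
  { rad: 8, medv: 33.4 },
  { rad: 24, medv: 10.2 },
  { rad: 24, medv: 13.8 },
];

// Split the rows into two groups: houses whose rad equals `highValue`
// (24 in this dataset) and all remaining houses (rad from 1 to 8).
function splitByRad(rows, highValue = 24) {
  const highRad = rows.filter((d) => d.rad === highValue);
  const lowRad = rows.filter((d) => d.rad !== highValue);
  return { highRad, lowRad };
}

const { highRad, lowRad } = splitByRad(sampleRows);
console.log(`rad = 24: ${highRad.length} rows, rad 1 to 8: ${lowRad.length} rows`);
```

Either group can then be passed to a chart as its own data source, or the original rows can be kept together and a derived boolean field used for color or faceting instead.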