The Forster transform is a method of regularizing a dataset
by placing it in {\em radial isotropic position}
while maintaining some of its essential properties.
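For concreteness, we recall the standard definition (stated here for reference; the notation $x_1, \ldots, x_n$ and $A$ is introduced only for this statement): points $x_1, \ldots, x_n \in \mathbb{R}^d$ are in radial isotropic position if
\[
  \sum_{i=1}^{n} \frac{x_i x_i^{\top}}{\|x_i\|_2^2} \;=\; \frac{n}{d}\, I_d ,
\]
and a Forster transform of the dataset is an invertible matrix $A \in \mathbb{R}^{d \times d}$ such that the rescaled points $A x_i / \|A x_i\|_2$ are in radial isotropic position.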
Forster transforms have played a key role in a diverse range of settings
spanning computer science and functional analysis. Prior work gave
{\em weakly} polynomial time algorithms for computing Forster transforms, when they exist,
with running times that scale polynomially with the bit complexity of the input.
Our main result is the first {\em strongly polynomial time} algorithm to
compute an approximate Forster transform of a given dataset
or certify that no such transform exists. By leveraging our strongly polynomial Forster algorithm,
we obtain the first strongly polynomial time algorithm for {\em distribution-free} PAC learning of halfspaces.
This learning result is surprising because {\em proper} PAC learning of halfspaces
is {\em equivalent} to linear programming, a problem for which the existence of
a strongly polynomial algorithm remains a longstanding open question.
Our learning approach extends to give a strongly polynomial halfspace learner
in the presence of random classification noise and, more generally, Massart noise.
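For reference, we use the standard noise models (the notation $w^{\ast}$, $\eta(x)$, and $\eta$ is introduced only for this statement): in the Massart model, the label $y$ of each example $x$ satisfies
\[
  \Pr\left[\, y \neq \mathrm{sign}(w^{\ast} \cdot x) \mid x \,\right] \;=\; \eta(x) \;\le\; \eta \;<\; \tfrac{1}{2} ,
\]
where $w^{\ast}$ defines the target halfspace and the flip probability $\eta(x)$ may depend on the example; random classification noise is the special case in which $\eta(x) = \eta$ for all $x$.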