An optimized data analytics pipeline for improving healthcare diagnosis using ensemble learning