Toward Real-Time On-Device WLAN Handover Control via Prompt Refinement <Abstract> Abstract: In Wireless Local Area Networks (WLANs), handover (HO) decision logic is a primary determinant of Quality of Service (QoS). Conventional mechanisms that rely on static Received Signal Strength Indicator (R_RSSI) thresholds often suffer from the ping-pong effect or the sticky-client problem because they lack semantic context. While Large Language Models (LLMs) offer zero-shot reasoning capabilities that can address these rigidities, deploying multi-billion-parameter models on resource-constrained edge devices introduces severe inference latency (T_inf) challenges. This paper investigates the feasibility of deploying a 2-bit quantized Llama-3.1-8B model locally as a supervisory handover controller. We perform a sensitivity analysis of prompt engineering strategies, contrasting heuristic attention biasing with a teacher-guided prompt refinement strategy distilled by a superior reasoning agent (Gemini). Experimental results on a GPU-accelerated edge surrogate reveal a critical Pareto frontier between signal maximization and system feasibility. We observe that while attention-biased prompts prioritize aggressive R_RSSI maximization (Δ_gain ≈ +4.0 dB), they induce excessive token generation (T_inf ≈ 48 s), violating real-time control-loop constraints. Conversely, teacher-guided prompts successfully inject logical hysteresis constraints, suppressing redundant handovers and reducing inference latency by approximately 28% (T_inf ≈ 18 s). These findings demonstrate that structural prompt optimization is not merely a linguistic tuning tool but a requisite mechanism for enabling real-time Edge AI mobility management. |
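To make the ping-pong effect and the role of a hysteresis constraint concrete, the following is a minimal sketch, not the paper's controller. It compares a naive "switch whenever the other access point is stronger" rule against a rule that requires a margin held over several consecutive samples. The margin (3 dB), the dwell count (2 samples), the two-AP trace, and all names (`HysteresisController`, `count_handovers`, `naive_decide`) are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class HysteresisController:
    """Hand over only if the candidate AP beats the serving AP by
    margin_db for dwell_steps consecutive samples (both values are
    hypothetical, chosen for illustration)."""
    margin_db: float = 3.0
    dwell_steps: int = 2
    _streak: int = 0

    def decide(self, serving_rssi: float, candidate_rssi: float) -> bool:
        if candidate_rssi - serving_rssi >= self.margin_db:
            self._streak += 1
        else:
            self._streak = 0  # advantage lost, restart the dwell count
        if self._streak >= self.dwell_steps:
            self._streak = 0
            return True
        return False


def naive_decide(serving_rssi: float, candidate_rssi: float) -> bool:
    """Switch as soon as the candidate is stronger at all."""
    return candidate_rssi > serving_rssi


def count_handovers(trace, decide) -> int:
    """Replay a two-AP RSSI trace (dBm) and count handovers."""
    serving, handovers = 0, 0
    for sample in trace:
        candidate = 1 - serving
        if decide(sample[serving], sample[candidate]):
            serving = candidate
            handovers += 1
    return handovers


# Synthetic trace: four samples of 1 dB jitter between AP0 and AP1,
# then AP0 degrades clearly while AP1 stays at -60 dBm.
trace = [(-60, -61), (-61, -60), (-60, -61), (-61, -60),
         (-70, -60), (-71, -60), (-72, -60)]

naive_count = count_handovers(trace, naive_decide)
hysteresis_count = count_handovers(trace, HysteresisController().decide)
print(naive_count, hysteresis_count)
```

On this synthetic trace the naive rule flips between access points during the jitter phase, whereas the hysteresis rule waits for a sustained advantage and hands over once. The teacher-guided prompts described in the abstract aim to induce this kind of constraint in the LLM's reasoning rather than hard-coding it as done here.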